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FINAL RE30RT: 

USE OF CXJUATERAL INFORMATION 
TO IMPROVE lANDSAT OASSIFICATION ACCURACIES 


By 


Alan H. Strahler 


This final technical report briefly summarizes the results obtained from 
a research grant entitled, "Use of Collateral Information to Improve Landsat 
Classification Accuracies," NASA Grant NSG-2377. The research supported dealt 
with three major issues: (1) the use of prior probabilities in maximum 

likelihood classification as a methodology to integrate discrete collateral 
data with continuously measured image density variables; (2) the use of the 
logit classifier as an alternative to multivariate normal classification that 
permits mixing both continuous and categorical variables in a single model and 
fits empirical distributions of observations more closely than the 
multivciriate rormal density function; and (3) the use of collateral data in a 
geographic information system as exercised to model a desired output 
information layer as a function of input layers of raster-format collateral 
and image database layers. 

A number of existing statistical techniques can be used to merge spectral 
image data with collateral information, including regression, ANOVA, MANOVA, 
ANOOVA, MANOOVA, discriminant analysis, contingency table £xnalysis, 
multivariate normal classification with or without prior probabilities, and 
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logit modeling. The choice of an apprc^riate technique depends upon the 
nature of the input and output variables — continuous, discrete, or 
categorical — and the appropriate model. The first five of these techniques 
assume a linear model with a fixed coveuciance structure for all observations. 
These restrictions are usually too narrow for renotely sensed data. 
Aooordingly, we focused our attention on the two least restrictive techniques 
that allow the merging of continuous and categorical data: multivariate 

normcil classification with prior probabilities and logistic classification. 

In the first technique, prior information concerning the expected 
distribution of classes in a final classification map is incorporated into the 
classification through the use of prior probabilities — that is, 
probabilities of occurrence for classes that are based on separate, 
independent knowledge concerning the area to be classified. The use of prior 
probabilities in a classification system is sufficiently versatile to allow 
(1) prior weighting of output classes based cxi their anticipated sizes; (2) 
the merging of continuously varying measurements (e.g., multispectral 
signatures) with discrete collateral data sets (e.g., rock type, soil type); 
and (3) the construction of time-sequenticd classification systems in which an 
eeirlier classification modifies the outcome of a later cxie. The prior 
probabilities are incorporated by modifying the decision rule employed in a 
Bayesian-type classifier to calculate a posteriori probabilities of class 
membership that are based not only on the resemblance of a pixel to the class 
signature, but also on the weight of the class that is estimated for the final 
output classification. In the merging of discrete collateral information with 
continuous spectral values into a single classification, a set of prior 
prc±)abilities (weights) is estimated for each value that the discrete 
collateral variable may assume (e.g., each rock type or soil type). When 
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multi V2u:iate normal classifications are performed, the prior probabilities 
appropriate to the particular pixel are used in classification. This reserach 
is summarized in the attached reprint, "The Use of Prior Probabilities in 
Maximum Likelihood Classification of Remotely Sensed Data,” v^ch appecured in 
Remote Sensing of Environment . This eirticle is attached as Appendix A. 

Logit modeling is a versatile technique that is well adapted to the 
classification of remotely sensed data. In this technique, whicii is sonetimes 
referred to as logistic regression, a linear compound of a vector of 
independent variables is used to predict a logit — the natural logarithm of 
ratio of the probability that a pixel belongs to a particular class to the 
probability that it does not. By simultaneously estimating logits for all 
classes for each pixel, the values of the logits can be manipulated to yield 
probabilities of membership for the pixel in each of the classes. These 
probabilities may be retained in their own right, used to select the most 
likely class for a pixel, or used in a system to estimate proportions. 

The logit model has two distinct advantages over oonventicxial Gaussian 
classification. The first of these is that the logistic model tends to fit 
asymmetrical probability distributions better than does the multivariate 
normal approximation. Second, the logit classifier allows the modeling of a 
probability density as a function of any type of variable in a linear or 
curvilinear model. Thus, it is considerably less restrictive than 
multivariate normal classification. The logit modeling work is presented in 
three papers forming Appendices B, C and D to this report. These are, 
"Incorporating Collateral Data in Landsat Classification and Modeling 
Procedures" (from the Proceedings of the Fourteenth International Symposium on 
Remote Sensing of Environment ) , "A Logit Classifier for Multi-Image Data" 

(from the Proceedings of the IEEE Workshop on Picture Data Description and 
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Managanent ) , and "The Logit Classifier — A General Maximun Likelihood 
Discriminant for Remote Sensing ;^lications” (from the Fifteenth 
International Symposium on Remote Sensing of Environment ) . 

To implement the logit classifier, two VICAR programs were coded. The 
first of these, IOGPR2, accepts training sets fron an input image and prints 
maximum likelihood values of the calibrating parameters for the training data. 
These p 2 urameters are thai input to a second program, PRCLSQ, which calculates 
the probability of class membership for each pixel in an input image and 
produces a classification map as an output image. These programs, coded in 
FQFCTRAN IV, appecir in Appendices E and F. 

The last primary task supported by this grant was the development of a 
geographic information system using both collateral and image data in a 
nontrivicil application in order to explore the operational difficulties of 
merging image and collateral data and using both to produce informaticMi as a 
desired output. This task is described in the M.A. thesis of M.A. Spanner, 
entitled "Soil Loss Prediction in a Geographic Informaticxi System Format," 
which is attached as Appendix G. In this work, soil loss due to erosion from 
rainfall was successfully predicted for the Santa Paula 7.5 minute quadrangle, 
Ventura County, California, utilizing the VICAR/IBIS image processing and 
geographic information system to simulate the Universal Soil Loss Equation 
(USLE) . 

The USLE requires as input rainfall, soil erodability, length of slope, 
slope gradient, and crop management and soil loss tolerance coefficients in 
order to predict annual soil loss in tons per acre per year. To provide these 
data, digital Landsat MSS data, USGS Digitcil Elevation Model topographic data, 
a digitized NOAA isopluvial map and digitized USDA Soil Conservation Service 
soil maps were used. Estimates of accuracy for the intermediate data planes 
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representing rainfall, soil erodabaility, length of slope, slope gradient, 
crop management and soil loss tolerance ranged from 81 to 100 percent. 
Incorporating slope and elevation information in the land use/land cover 
classification that was used to identify crops and thus crc^ management 
coefficients was noted to improve classification accuracies considerably. 
Watershed chauracteristics including slope and length of slope were effectively 
depicted using the DEM topographic data. Digitized soil maps were a powerful 
information source, allowing soil erodability and soil loss tolerance to be 
represented in an image format. 

An accuracy analysis revealed a correlation coefficient of 0.91 between 
soil loss values obtained from the developed geobased model and a sample of 
manually derived soil loss values. In addition, the system accurately 
targeted soil loss problem areas for subsequent analysis by Soil Conservation 
Service personnel. Useful statistics from the soil maps, topograjAiic data and 
land cover were also obtained. In summary, the maps and statistics produced 
in this application of geograhic information systems technology have a high 
potential to provide useful information to resource managers dealing with the 
problem of soil loss in agricultural areas. 

In addition to the articles and theses identified in the foregoing 
narrative, NASA Grant NSG-2377 supported small portions of other theses and 
publications. The list of publications below includes these materials as 
well. 


List of Publications 
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The Use of Prior Probabilities in 

Maxiiniiin Liltelihood Classification of Remotely Sensed Data 


ALAN H. STRAHLER 

(/ntemily of Cobfomia, Sonlo Barbara, CoHfomia 


The expected distribution of denes in a final daarificatioo map can be used to Improve classification accuracies. Prior 
infonnatiao to i ncorporated through the use of prior probabilities- that is, probabilities of occ ur rence of rlassra whidi 
are ba s ed on separate, independent knowledge conceming the area to be classified The use of prior probabilities in a 
dassificatioo system is sufficiently versatile to allow (1) prior weighting of output classes baaed on their anticipated 
sizes; (2) the merging of continuously varying measurenoents (multiapectral signatures) widi discrete collateral 
infarmatioo datasets (e.g„ rock type, soil type); asid (3) the oonstrucbon of ttme^equantial dassiftcation systems in 
which an earlier classification modifies the outcome of a later one. Tbe prior probabilities ate incorporated by 
modifying the maximum likelihood dedskm rule employed in a Bayeaian-type classifier to calculate a pottmiori 
probabilities of class membership which are based not only on the resemblance of a pixel to the daas signature, but 
also on the weight of the clast which is estimated for the final output clasafication. In die merging of di s c r e t e 
collateral information with continuous spectral values into a single dassificatioo, a set of prior probabilities (weights) 
is estimated for each value which the discrete collateral variable may ass ume (e.g., each rock type or toil type). When 
maximum likelihood calculations are performed, the prior probabilities appropriate to the particular pixel are used in 
classification. For time-sequential classification, the prior classification of a pizd indexes a set of appropriate 
conditional probabili t ies reflecting either the confidence of the investigator in the prior classification or the extent to 
which the prior class identified is likdy to change during the time period of interest. 


Introduction 

In the past ten years, maximum likeli- 
hood clas^cation has found wide appli- 
cation in the field of remote sensing. 
Based on multivariate normal distribu- 
tion theory, the maximum likelihood clas- 
sification algorithm has been in use for 
applications in the social sciences since 
the late 1940s. Providing a probabilistic 
method for recognizing similarities be- 
tween individual measurements and pre- 
defined standards, the algorithm found 
increasing use in the field of pattern 
recognition in the following decades 
(Chow, 1957; Sebestyen, 1962; Nilsson, 
1965). In remote sensing, the develop- 
ment of multispectral scanning technol- 
ogy to produce layered multispectral dig- 
ital images of land areas from aircraft or 


spacecraft provided the opportunity to 
use the maximum likelihood criterion in 
producing thematic classification maps of 
large areas for such purposes as land- 
use /land-cover determination and natu- 
ral cultivated land inventory (Schell, 
1972; Reeves et al., 1975). 

In the last decade, research on the 
general use of classification algorithms in 
remote sensing has centered in two areas: 
(1) computational improvements in 
evaluating maximum likelihood and dis- 
criminant function decision rules; and (2) 
the use of various unsupervised clustering 
algorithms to extract repeated or com- 
monly occurring measurement vectors 
which are characteristic of a particular 
multispectral scene. Computational im- 
provements have included such develop- 
ments as look-up table schemes (Schlien 
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and Smith, 1975) to reduce repeated 
calculation, and hybrid classifiers (Ad- 
dington, 1975) which use parallelepiped 
algorithms (Goodenough and Schlien, 
1974) first, then turn to maximum likeli- 
hood computation to resolve ambiguities. 
Although important for small image 
processing systems, further computa- 
tional improvements will become less and 
less cos^ effective as real-time computa- 
tional costs continue to ^1 due to the 
development of fourth- and fifth - 
generation computer hardware systems. 

Unsupervised methods rely on cluster- 
ing measurement vectors according to 
some set of distanc'e, similarity, or disper- 
sion criteria. Many clustering heuristics 
have been devised and applied in image 
processing. Dubes and Jain (1976) pro- 
vide a review and comparative analysis of 
a number of techniques which are com- 
monly applied in pattern recognition. 
However, as Kendall (1972, p. 291) points 
out, clustering is a subjective matter to 
which little probabilistic theory is appU- 
cable. No clustering algorithms have as 
yet come to the fore which can incorpo- 
rate prior knowledge in a formal fashion 
(except for the use of a priori starting 
vectors in interactive clustering) with an 
expected increment in class identification 
accuracy produced by the use of this 
additional information. However, recent 
developments involving guided clustering 
and automated labeling of unsupervised 
clusters blur the distinction between su- 
perNTsed and unsupervised techniques. 
Fuhire work may well produce a con- 
tinuum of intergrading methods from 
which a user can select a mix apprr^nriate 
to the spatial, spectral, and temporal res- 
olution of the data in hand and informa- 
tion output desired. 


The purpose of this paper is to show 
how the use of prior information about 
the expected disMbution of classes in a 
final classification map can be used in 
several different models to improve clas- 
sification accuracies. Prior information is 
incorporated throi’.gh the use of prior 
probabilities — that is, probabilities of oc- 
currence of classes which are based on 
separate, independent knowledge con- 
cerning the area to be classified. Used in 
their simplest form, the probabilities 
wei^t the classes according to their ex- 
pected distribution in the output dataset 
by shifting decision space boundaries to 
produce larger volumes in measurement 
space for classes that are expected to be 
large and smaller volumes for classes ex- 
pected to be small. 

The incorporation of prior probabili- 
ties into the maximum likelihood decisiori 
rule can also provide a mechanism for 
merging continuously measured observa- 
tions (multispectral signatures) with dis- 
cretely measiu-ed collateral variables such 
as rock type or soil type. As an example, 
consider an area of natural vegetation 
underlain by two distinctive rock types, 
each of which exhibits a unique mix of 
vegetation classes. Two sets of prior 
probabilities can be devised, one for each 
rock type, and the classifier can be mod- 
ified to use the appropriate set of prior 
probabilities contingent on the underly- 
ing rock type. In this way, the classifica- 
tion process can incorporate discrete col- 
lateral information into the decision rule 
through a model contingent on an ex- 
ternal conditioning variable. The method 
can also be extended to include two or 
more such discrete collateral datasets: the 
number is limited only by tlie ability to 
estimate the required sets of prior proba- 
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bilities. Thus, prior probabilities provide 
a powerful mechanic for merging col- 
lateral datasets with multispectral images 
for classification purposes. 

Another application of prior probabili- 
ties contingent upon a collateral dataset 
allows temporal wei^ting in a time- 
sequential classification system. As an ex- 
ample, consider distinguishing between 
two crop types which, through differing 
phenologies, can be easily separated early 
in the season but are confused later on in 
the growing period. Through the use of 
prior probabilities, a midsummer classifi- 
cation can “look backward” to a spring 
classification to resolve ambiguity. Thus, 
winter wheat could be separated from 
spring wheat at midseason by its distinc- 
tive early spring signature. This use of 
temporal information provides an alter- 
native to the calculation of transformed 
vegetation indexes (TVls) and compara- 
ble procedures (Richardson and Wiegand, 
1977) in the identification of crops with 
multitemporal images (Rouse et al., 1973). 
Such a time-sequential classification sys- 
tem could also be used to monitor land 
use change. In this case, a Markov-type 
predictive model is used directly to set 
prior probabilities ba.sed on patterns of 
change shown in an area. 

Review of Mazimur.i 
Likelihood Classification 

To understard the application of prior 
probabilities to a classitication problem, 
we must first briefly review the mathe- 
matics of the maximum likelihood deci- 
sion rule. For the multivariate case, we 
assume each observation X (pixel) con- 
sists of a set of measurements on p vari- 
ables (channels). Through some external 


procedure, we identify a set of observa- 
tions which correspond to a class — that 
is, a set of similar objects characterized 
by a vector of means on measurement 
variables and a variance -covariance ma- 
trix describing the interrelationships 
among the measurement variables which 
are characteristic of the class. Although 
the parametric mean vector and disper- 
sion matrix for the class remain un- 
Imown, they are estimated by the sample 
means and dispersion matrix associated 
with the object sample. 

Multivariate normal statistical theory 
describes the probability that an observa- 
tion X will occur, given that it belongs to 
a class k, as the following function: 

( 1 ) 

(Table 1 presents an explanation of the 
symbols used in this and other expres- 
sions.) The quadratic prxluct 

X*-(X-m,)'S.-'(X-m.) (2) 

can be thought of as a squared distance 
function which measures the distance be- 
tween the observation and the class mean 
as scaled and corrected for variance aiKi 
covariance of the class. It can be shown 
that this expression is a x* variate with p 
degrees of freedom (Tatsuoka, 1971). 

As applied in a maximum likelihood 
decision rule, expression (1) allows the 
calculation of the probability that an ob- 
servation is a member of each of k classes. 
The individual is then assigned to the 
class for which the probability value is 
greatest. In an operational context, we 
substitute observed means, variances, and 
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TABLll Notebon 


Tim 

DiFiwmow 

P 

Number ol meMuremeol virieblei uied to dwncterln each 
object or obeervebon. 

X 

A p-dtnwnboiial random vector. 

X, 

Vector of maaeuramenti on p varlablm amoriated with the ith 
object or obaarvaboo; <>1.8 N, 

r[x,) 

Probability that a p-dimentionel random vector X win take on 
obaerved vahiaaX,. 

**» 

Marober of the libi Mt of clamea w; hw 1,8...., X. 


Member of the /th let of itatea for a oondlbonin| variable rj 
i- 1.8 /. 

F{«») 

Probability btat an ofaaervabon will be a member of clam 
prior probability for clam w^. 


Probability that aa obiervabon wiO be amociated with itala f of 
coodiboning variable prior probability for date . 
Probability that an obaervabon it a member of clam given 
that meaturement vector X, it obaerved. 


Probability denbty value amociated with obaervabon vector X, m 
evaluated for clam k. 


Parametric mean vector amoriated with the Icth clam. 

■» 

Mean vector araodated with a tample of obaervaboni beioogiiig 
to the ith clam; taken m an etbinator of 

I. 

Parametric p by p ditperaioa (variance<ovariaioe) matrii 
amociated with the ith dam. 


p by p disperaion matru amociated with a rample of obaervabont 
belonging to the ith clam; taken m an eabmalor of Z 4 . 


covariances and use the log form of ex- 
pression (1) 

ln[<I>*(Xj]- -ipln(27r)-iln|2J 

-i(X -m*)'D*-»(X -mj. 

( 3 ) 

Since the log of the probability is a 
monotonic increasing function of the 
probability, the decision can be mads by 
comparing values for each class as calcni- 
lated from the right hand side of this 
equation. This is the decision rule that is 
used in the currently distributed versions 
of LARSYS and VICAR, two image 
processing program systems authored re- 
spectively by the Laboratory for Applica- 
tions of Remote Sensing at Purdue Uni- 


versity and the Jet Propulsion Laboratory 
of California Institute of Technology at 
Pasadena. A simpler decision rule, Rj, 
can be derived from expression (3) by 
eliminating the constants (Tatsuoka, 
1971): 

Rj: Choose k which minimizes 

Fi.*(Xj- 

ln|DJ + (X(-mk)'D*~'(X,-mJ. 

(•*) 

The Use of Prior Probabilities 
in the Decisio.i Rule 

The maximum likelihood decision rule 
can be modified easily to take into 
account prior probabilities which de- 
scribe how likely a class is to occur in the 
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population of observations as a whole. 
The prior probability itself is simply an 
estimate of the proportion of the ofa^ts 
which will fall into a particular class. 
Tfiese prior probabilities are sometimes 
termed “weights," since the modified 
classification rule will tend to wei^ more 
heavily those classes with hi^er prior 
probabilities. 

Prior probabilities are iiKorporated 
into the classification throu^ manipula- 
tion of the Law of Conditional Probabil- 
ity. To begin, we define two probabili- 
ties: the probability that an 

observation will be drawn from class Ui,', 
and P{X|}, the probability of occiirrence 
of the measurement vector X^. The Law 
of Conditional Probability states that 


is usually assumed to be Gaussian. Note 
that in some cases this assumption may 
not hold; as an example, Brooner et al. 
(1971) showed significantly higher classi- 
fication acctuacies for crops using simu- 
lated multispectral imagery with direct 
estimates of these conditional probabili- 
ties than with probabilities ^culated 
according to Gaussian assiunptions. How 
ever, the use of the multivariate normal 
approximation is widely accepted, and, in 
any case, it is only under rare circum- 
stances that sufficient data are obtained 
to estimate the conditional probabilities 
directly. 

Thus, we can assume that P(XJwt) is 
acceptably estimated by 4>t(X j and re- 
write expression (6) as 




••(X,) ■ 


( 5 ) 


♦.(X.) 


P(“..X.) 
P(«.) • 


m 


The probability on the left-hand side of 
this expression will form the basis of a 
modifi^ decision rule, since we wish to 
assign the ith observation to that class 
which has the highest probability of oc- 
currence given the p-dimensional vector 
X, which has been observed. 

Again using the Law of Conditional 
Probability, we find that 

In this expression, the left-hand term 
describes the probability that the mea- 
surement vector will take on the values 
X^ given that the object measured is a 
member of class This probability 
could be determined by san.pling a popu- 
lation of measurement vectors for 
observations known to be from class u/,; 
however, the distribution of such vectors 


Rearranging, we have 
PK,X,)-*,(X,)P(<.,,)-4.f(X,). 

( 8 ) 

Thus, we see that the numerator of 
expression (5) can be evaluated as the 
product of the multivariate density func- 
tion 4>k(X|) and the prior probability Oi' 
occtirrence of class ui,. 

To evaluate the denominator of 
expression (5), we note that for all k 
classes the conditional probabilities must 
sum to 1: 

i p(u,ix,)-i 

p{x.) r 


( 9 ) 
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Therefore, 

P{X,}- i (10) 

k-l 

Substituting Eqs. (8) and (10) into (5), 



The last expression, then, provides the 
basis for the decision rule which includes 
prior probabilities. Since the denomina- 
tor remains constant for all classes, the 
observation is simply assigned to the class 
for which 4>jf(X,), the product of 4*t(X,) 
and ?{(<}(}, is a maximum. In its simplest 
fonn, this decision rule can be stated as: 
R 2 - Choose k which minimizes 


F,,.(X,)-ln|D,| 

+ (X,-n.,)'D,-‘(X ,-m.) 
-21nP(„,). (12) 

(Tatsuoka and Tiedeman, 1954). 

It is important to understand how this 
decision rule behaves with different prior 
probabilities. If the prior probability 
P(wi^} is very small, then its natural loga- 
rithm will be a large negative number; 
when multiplied by — 2, it will become a 
large positive number and thus for 
such a class will never be minimal. 
Therefore, setting a very small prior 
probability will effectively remove a cla^s 
b'om the output classification. Note that 
this effect will occur even if the observa- 
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tion vector X^ is coincident with class 
mean vector m|^. In such a case, the 
quadratic product distance function (X, 
— mj)'Dj ‘(Xj-m*) goes to zero, but the 
prior probability term — 21nP((i^j^} can 
still be large. Thus, it is entirely possible 
that the observation will be clasihed into 
a different class, one for which the dis- 
tance function is quite large. 

As the prior probability P{wi^} be- 
comes large and approaches 1, its loga- 
rithm will go to zero and will ap- 
proach Fi ll for that class. Since this 
probab'lity and all others must sum to 
one, however, the prior probabilities of 
the remaining classes will be small num- 
bers and their values of Fj^j will be 
greatly augmented. The effect will be to 
force classification into the class with 
high probability. Therefore, the more ex- 
treme are the values of the prior proba- 
bilities, the less important are the actual 
observation values X^. This point is dis- 
cussed in more detail in a following sec- 
tion. 


Numerical example 

A simple numerical example may 
clarify this modification of the maximum 
likelihood decision rule. For this exam- 
ple, we assume two classes to | and 102 in a 
two- dimension;* 1 measurement space. 
Their means and dispersion matrices are 
shown below. 


m,-[4 

2]. 

d ,-[5 

4 

ej’ 


5 

7. ■ 


3]. 

(13) 


The determinants and inverses of these 
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matrices are 


ID, 1-2, !I^|-3, 



[3 3 


(14) 


For this example, we wish to decide to 
^»duch class the measurement vector (4,3) 
belongs. 

To evaluate the probability associated 
with Up we firT' evaluate the quadratic 
product 


X*-(X-m,)'D,-‘(X-m,) (15) 


xf-[o 




(16) 


The probability density value is then 


<I>,(X) - X — -0.0532. 

‘ 2ir y/2 

(17) 

Similarly, for the second class, 

xl-L-l 0]x 

-2 


I — 5 


-1 

3 3 

X 


_ 5 4 

3 3 


0 


(18) 


<*>2(X)- 7^ X — Xe-i’^*- 0.0338 
’ V3 


(19) 


in class Uf, and it would be aj^ropriate 
to classify the measurement into (>)|. 

This same decision can be made by 
using the somewhat simpler decision rule 
Ri [expression (4)]: 

F|.i(X)-ln|Di| + (X-m,)'Df‘(X-m,). 

( 20 ) 

-b(2) + 1-2.193; (21) 

F,,,(X)-ln|Di| + (X-m*)'Dj-*(X-m2). 

( 22 ) 

-ln(3) + 2-3.098. (23) 

Here ^ain the decision is made to clas- 
sify the observation X into coi. 

The foregoing calculations assume 
equal probability of membership in U| 
and « 2 . Removing this restriction, we 
take prior probabilities into account. As- 
sume that the following prior probabili- 
ties are observed: 

P(«,)-l, P(<-,)-|. (24) 

Recalling the notation from expression 
(11) that ^f(XJ denotes the probability 
density function adjusted for the prior 
probability, 

*f(X,).*,(X.)P(u,,). (25) 

we calculate for the two classes 

4>f(X)-<l>,(X)P{a,,}-(0.0532)xl 
-0.0177, (26) 

<I»j*(X)-4>2(X)P{*.,2}-(0.0338)x I 


Thus, the measurement vector (4, 3) has a —0.0225. (27) 

higher probability associated with mem- 
bership in class Wj than with membership The actual conditional probabilities [ex- 
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pression (11)] then become 

. 1^ _/. _ 0.0177 

P{«i|Xj“(4,2)}- 0.0177+0.0225 

-0.440, (28) 

Bf IV lA on 00225 
*^{‘^*l^'“('*’2))“ 0.0177 + 0.0225 

-0.560. (29) 

Thus, class is favored for the observa- 
tion (4,3) over class Wj. 

In terms of the decision rule Rz, we 
calculate 

F 2 .,(X)-ln|D,| 

+ (X-m,)'Df‘(X-mi) 
-21nP{«,}, (30) 

-Fi.,(X)-21nP{w,), (31) 

-2.193-21n(i)-4.390. (32) 

For the second class, the outcome is 

F2.2(X)-3.098-21n(§)-3.908. (33) 

Since the observation is classified into the 
class which minimizes the value of R 2 , 
once again the second class is chosen. 

Prior Probabdities Contingent on a 
Single External Conditioning Variable 

Having shown how to modify the deci- 
sion rule to take into account a set of 
prior probabilities, it is only a small step 
to consider several sets of probabilities, 
in which an external information source 
identifies which set is to be used in the 
decision rule. As an example, consider 
the effect of soil type on the distribution 
of crops that are likely to be grown in an 
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area. In such a case, a single suite of 
crops will characterize the entire area, 
but the expected distribution of crops 
from one soil type to the next could be 
expected to vary considerably. Under 
these circumstances, it would be possible 
to collect a stratified random sample of 
the area to be classified, in order to 
quantify two sets of prior probabilities: 
one for the crops on the first soil type, 
the other for crops on the second. 

Thus, we introduce a third variable i'^, 
which indicates the state of the extern^ 
conditioning variable (e.g., soil type) as- 
sociated with the observation. VVe wish, 
then, to find an expression describing 

P{<„,|X,.|.,), (34) 

the probability that an observation will 
be a member of the class given its 
vector of observed measurements and the 
fact that it belongs to class of the 
external conditioning variable. 

In deriving an expression to find this 
probability, we can make the assumption 
that the mean vector and dispersion ma- 
trix of the class will be the same regard- 
less of the state of the external condition- 
ing variable. This assumption is discussed 
more fully in a later section. The assump- 
tion implies that 

P(X,|u,)-P{X,k,.,.,). (35) 

Expanding both sides of this relationship 
using the Law of Conditional Probability, 

P(X,^ _ 

P(«.,) P(<0„»,) • 

Solving for the three-way joint probabil- 
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P{X„u„r,) 


F(".) 

(37) 




Substituting expression (8) into the left- 
hand tenn of the right denominator. 
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<I«,(X,)P{«„»,) 

K 


2*,(x.)PK.r,) 

k-i 


*r(x.) 
1 ♦nx,) 

k-i 


(«) 


P{X,.k.„r,) 


»k(X.)P(»k)P(«k.>|) 

P(“k) 


(38) 


-<P,(X,)P{u,.,,) 




(39) 


Expanding expression (34) ai'cording to 
the Law of Conditional Probability, 


P(«k|X,.k,) 


P(“k.X,.i’|) 

P{X..-,) 


( 40 ) 


This result is analogous to expression (11); 
note that the denominator remains con- 
stant for all k, and need not actually be 
calculated to select the class for which 
4>|f*(X,) is a maximum. 

The application of this expression in 
classification requires that the joint prob- 
abilities P(Wk, be known. However, a 
simpler fonn using conditioiud probabili- 
ties directly obtained from a stratified 
random sample can be obtained through 
the application of the Law of Conditional 
Probability: 

P{<.;,.|.,)-P{u,|,,)P(,|). («) 


Noting that since all clas.ses are included, 
expres.sion (40), when summed over all 
classes, must equal 1, 


K 


2 p{k..ix„k,)-i- 

k-I 


2P("k.X,.r,) 

* 'p(x,.-,l 

(41) 


Rearranging, 

K 

P(X,.^)- 2 P|c.J„X,.,,). (42) 
Substituting expres.sions (39) and (42) into 


Since P{»’^) cancels from the numerator 
and denominator after substitution, we 
have 




2 4»*(XjP{u,Jr,} 
4-1 


(45) 


Thus, either the joint or conditional 
probabilities may be used in the decision 
rule: 

Rj: Choose it which minimizes 


F3.*(Xj-ln|Dj 

+ (X,-mJ'Di-'(X,-mJ 
-21nP{tk)4, 1',} (46) 
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A 3 : Choose k whidi minimizes 

R..(X.)-ll.|D.| 

+ (X-n..)’D.-‘(X,-m,) 
-2lnP(<o,|r,). (47) 

Numerical eaample 

To illustrate this use of prior probabili- 
ties contingent on an external condition- 
ing variable, let use return to the two- 
class example discussed earlier. This time, 
however, let us assume that a stratified 
random sample of the area to be classi- 
fied produces the estimates of probabili- 
ties shown in Table 2. The conditioning 
variable has two states: Vi and i> 2 . 
Under the conditions of both classes 
have equal prior probabilities; under I'j, 
the second class is more likely to appear, 
with the probability of 0.7 for U 2 and 0.3 
for Cl) j. Table 3 presents the calculations 
for this example. For fi, U] would be the 
most likely choice. In the case of (i }2 is 
more probable. 


TABLE t Simple Prior ProbahiliUe* 
for Numerical Example 


CONOmONINC Vajuabli 

PnoMABiuTr 

'i 

»i 


0.5 

OJ 

»{«t) 

0.5 

0.7 
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Addine Additional Conditioning 
Variables 

Logic analogous to that of the preced- 
ing section shows that classification deci- 
sions may be made contingent on any 
number of external multistate condition- 
ing variables. However, a separate set of 
prior probabilities must be estimated for 
all possible states of conditioning varia- 
bles. For example, consider classifying 
natural vegetation in an area containing 
four distinctive rock types, six different 
soil types, and four unique topographic 
habitats. Ninety-six sets of prior probabil- 
ities will then be required. Estimating 
these probabilities by separate samples 
would be prohibitive for such a large 
number of combinations. 

To alleviate this problem, it is possible 
to model these probabilities from a much 
snudler set under the assumption of no 
high-level interaction. This procedure 
amounts to the calculation of expected 
values for a multidimensional con- 
tingency table when only certain margi- 
nal totals are known. Techniques for such 
modeling have been described in the re- 
cent statistical literatiue, and are sum- 
marized in two current books by Bishop 
et al., (1975) and by Upton (1978). [Other 
treatments appear in Cox (1970) and 
Fienbuig (1977).] The discussion below is 
based partly on the treatment presented 


TABLE 3 of .Mazimiiffl UkellKood Poxterior Probxbditiex 
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»i 


"1 

w, 

«i 


0.0632 

0.0038 

0.0632 

0.0038 


0.0266 

0.0160 

0.0160 

0.0237 


0.0266 

0.0160 

0.0160 

0.0237 


0.0435 

0.0307 


0.611 

0080 

0.403 

0007 
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in pp. 57-101 of Bishop et al., and the 
reader is referred to these works for cases 
involving modeling beyond the tpvariate 
case presented here. 

As a simple example, consider the 
three-way case in which a measurement 
is a member of class u^ and is also associ- 
ated with two conditioning variables 
and Of. Then the Law of Conditional 
Probability states 


P{“fckf.O/} 


-jc • 


since 


P{Pf,0,} 


K 

2 


k-l 




Rescaling then proceeds for another set 
of two-way conditionals. 


/ 

xi=i 

P{«k-o,} ’ 

(50) 


and finally for the last set. 


L 

2 P2{«k.''^.0/} 

X 

P{Wk.»'/} 

(51) 


There are KxJxL probabilities of the 
form P{w*, Pf, Of), and we wish to esti- 
mate these with maximum likelihood 
without sampling the full set. Such an 
estimate is possible, assuming no three- 
way interaction between u, p, and o, if 
probabilities of the forms P(«t, pf), 
P(w 4 ,Of), and P{Pf, Of) are known. 

The method, firet described by Dem- 
ing and Stephan (1940), requires iterative 
fitting of three-way probabilities 
P(wj,Pf, Of} to conform with observed 
two-way probabilities P{wt, Pf), P{<*>j,Of} 
and P(Py,Of}. Beginning with an initial 
starting probability Po{<>>fc. Py. Oy), the 
individual three-way probabilities are first 
rescaled to conform with one set of two- 
way probabilities, 

Pl(«k.»'f.Of}-Po{Wk.Py.Of} 

X 

2 Po{«k."f.O/} 

y Aii 

•’(>'/. 0|) 

(49) 


As the procedure is repeated, conver- 
gence occurs rapidly, and values stabilize 
within a small number of iterations (De- 
miug and Stephan, 1940). The method 
always converges toward the unique set 
of maximum likelihood estimates and can 
be used with any set of starting values; 
further, estimates may be determined to 
any preset level of accuracy (Bishop et 
al., 1975, p. 83). 

In a typical remote sensing applica- 
tion, a stratified random sample is col- 
lected which estimates the two condi- 
tional probability sets P{uji| Py} and 
P(fa)it|Of}. In addition, probabilities of the 
form P(«t,Of) are obtained by process- 
ing registered digital images of maps 
showing the spatial distributions of p and 
o. By noting that 

2i’(»'f.Of} (52) 

/-I 

and using the Law of Conditional 
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TABLE 4 CooditloiMJ Prolwbditiai for Numatical Eiompto 







0|) 


w 

•-I 



»i 

0% 

Oj 

*»1 

0.6 

0J( 

0.2 

OJ 

OB 

OB 


OJ 

0.2 

0.4 

0.4 

0.1 

OB 

“J 

0.1 

OJ 

0.4 

0.1 

0.1 

OB 


Probability, 

( 53 ) 

the joint probabilities P(<Oj^, i'^} and 
P{wi^, 0 {} can easily be calciilated from 
the sets of conditional probabilities 
P{«tlr,) and P{wJo,}. 

Numerical example 

A simple numerical example will il- 
lustrate the iterative method. Table 4 
presents a set of one-way conditional 
probabilities for an e.xample of three 
classes Wj, it— 1,2,3 with two condi- 
tioning variables v^, /— 1,2,3, and O/, /— 
1,2,3. .Although simple decimal values 
are assumed here for ease of computa- 


TABLE S Joint ProbabiUties for Numerical Example 


*’{'(. 0|) 

9 

Ol 

Ot 




0.06 

0.12 

0.14 

0B4 


0.07 

0.09 

0.12 

09« 


0.16 

0.10 

0.12 

0.38 

rio,) 

OBI 

OBI 

0B8 



tion, these values would normally be ob- 
tained by prior random stratified sam- 
pling. Also required are the joint proba- 
bilities P{i'^, O/} (Table 5). Tables 6 and 7 
show how values for P{(Oj^, and 
P{( 0 jt, 0 () are calculated according to ex- 
pressions (52) and (53). 

Table 8 presents the results of the first 
two iterations in fitting the no two-way 
interaction models to these data. Using 
the criterion of no further change in any 
P{«fc, Oj) of greater than 10"*, con- 
vergence is reached at iteration 23. Al- 
though these probabilities can be used 
directly in decision rule R^, it may be 
easier to examine the values as condi- 
tional probabilities as used in R'^. These 
values are shown in the last column of 
the table. 

Time-Sequential Classification 

If a classification carried out at a earlier 
time is viewed as an external condi- 
tioning variable, then the mechanism of 
prior probabilities can be used to make 


TABLE 6 Calculation of Joint Two-Way Probabilities for Numerical Example 


"t 



P{«t,r 

} F{«-*ks) 


P{^) 


P{w».i-t 



F{»,} 


P{«S.»-3) 


0.6 X 

0B4 - 

0204 

0.5 

X 

028 

- 

0.140 

02 

X 

0B8 

- 

0.076 

“i 

OB X 

0B4 - 

0.102 

02 

X 

028 

m 

0.066 

0.4 

X 

0B8 

- 

0.152 

"3 

0.1 X 

0B4 - 

0.034 

OB 

X 

028 

- 

0.084 

0.4 

X 

0B8 

- 

0.152 

TABLE 7 Calculation of Joint Two-Way Probabilities for Numerical Example 


P{«»|0,) 


P{U»,0| 



P{Ot) 


*•{«». Of) Piwslos) 





"i 

OB X 

OBI - 

0.155 

0.8 

X 

OBI 

m 

0.248 

02 

X 

0B8 

- 

0.076 

“s 

0.4 X 

OBI - 

0.124 

0.1 

X 

OBI 

- 

0.031 

OB 

X 

0B8 

• 

0.114 


0.1 X 

OBI - 

0.031 

0.1 

X 

OBI 

- 

0.031 

0.5 

X 

0B8 

- 

0.190 
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the outcome of a classification contingent 
on the earlier classification. This applica- 
tion is best clarified by an example. Con- 
sider an agricultural classification with 
four field types: rice, cotton, orchard, 
and fallow. An early spring classification 
reveals the presence of young rice with 
high accuracy, but at that time cotton 
cannot be distinguished from fallow 
fields. Orchards are easily distinguished 
from field crops at any time of year. By 
early summer, many fields which classi- 
fied as fallow are likely to be in cotton; 
however, fields classified as rice are still 
likely to be rice. Orchards will remain 
unchanged in areal extent. 

Data from prior years are collected to 
quantify these expected changes, and a 
transition probability matrix is devised 
which describes the changes in classifica- 
tion expected during the early spring- 
early summer period (Table 9). In this 
example, spring classification shows 30% 
of the observations to be rice; by summer, 
90% of these observations are expected 
to continue as rice, with 10% returning to 
fallow because of crop failure ot lack of 
irrigation. Twenty percent of the spring 
observations are orchards, and all of these 
are expected to remam in orchard 
through early sunrimer. Fallow fields, 
constituting 50% observations, are most 
likely to become cotton (probability* 
0.7), with a few becoming rice, orchard. 
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or remaining fallow (probabilities *0.1). 
Since no observations are classified as 
cotton in spring, no transition probabili- 
ties are needed for that class. This use of 
transition probabilities was suggested as 
early as 1967 by Simonett et al. 

The transition probability matrix can 
also be recogized as a matrix of condi- 
tional probabilities which de- 

scribe the probability that an observation 
will fall into summer class ui,, given that 
the observation falls into spring class Pf. 
Thus, the early summer classification can 
be made contingent on the early spring 
classification through the prior probabil- 
ity mechanisms discussed earlier, and any 
possible confusion between cotton and 
rice in summer will be resolved by the 
spring classification. It is also interesting 
to note that the transition probability 
matrix is actually a square stochastic ma- 
trix, and therefore the situation is equiva- 
lent to a simple one-step Markov process. 
Under these conditions, the expected 
posterior probabilities P{<V|,} are 

PK)- (54) 

/-I 

In a rec'ent paper, Swain (1978) has 
''arried this approach a step further, in- 
corporating in the decision rule both 


TABLE 9 Agricultural Time-Sequential ClatrifiraHon Ejumple 



Spiiinc 

Clam 





“1 

"t 




Ricz 

COTTOM 

OncatAiio 

Faixow 

'1 

Rice 

OJ 

0.9 

0.0 

0.0 

0.1 


Cotton 

0.C 

— 

— 





Orchard 

0.2 

0.0 

0.0 

1.0 

0.0 

»4 

Fallow 

0.5 

0.1 

0.7 

0.2 

0.1 


P{w,) 


0.32 

OJS 

0.25 

0.08 
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measurement vectors j and X, j taken 
from times f j and t 2 at point i in space. 
In contrast, the approach described above 
uses X| I to predict at time t, and then 
uses X, 2 and Py to make the classification 
decision at time tg. Swain’s decision rule, 
in the notation of this paper, becomes 

Rg: Choose k to maximize 

f’s.kiXM.X,.*)- 

SP{x,,|p,)p{x..2|w*}P{wJi',}P{»',}. 

(55) 

Swain has termed this rule the "cascade 
classifier.” 

Swain’s approach has the advantage of 
using full information about the distances 
of the measurement vectors X^ i and X( 2 
from class means; however, as Swain 
notes, there is no way to make the first 
observation set dominate the second. 
When the transition probability matrix 
goes to an identity matrix, the classifica- 
tion rule becomes: 

(56) 

and the two observations becomes equally 
weighted. Decision rule A 3 does allow 
the first observation to dominate; here, 
an identity transition probability matrix 
will preserve the first classification com- 
pletely. On the other hand, A 3 assiunes 
that the prior classification is perfectly 
correct, and any errors in the prior clas- 
sification will also be preserved to an 
extent controlled by the transition proba- 
bilities. Thus, both approaches are rele- 


vant, depending on the classification task 
at hand. 

Remote Sensing Example 

The preceding numerical examples 
have demonstrated the application of 
prior probabilities in maximum likelihood 
classification in a computational context; 
a real example drawn from remote sens- 
ing should serve to further enhance un- 
derstanding in an operational context. 
This example (Strahler et al., 1978) is 
drawn fi-om a problem involving classifi- 
cation of natural vegetation in a heavily 
forested area of northern California. In 
the classification, spectral data are used 
to define species-specific timber types, 
and elevation and slope aspect are used 
as collateral data channels to improve 
classification accuracy. 

The area selected for application of 
the classification techniques described 
above is referred to as the Doggett Creek 
study area, comprising about 220 km^ of 
private and publicly-owned forest land in 
northern California near the town of 
Klamath River. Located within the 
Siskiyou Mountains, elevations in the area 
range from 500 m at the Klamath River, 
which crosses the southern portion, to 
2065 m near Dty Lake Lookout on an 
unnamed summit. A well developed net- 
work of logging roads and trails is pre- 
sent, provi^ng relatively easy access to 
nearly all of the area by road or foot. 

A wide variety of distinctive vegeta- 
tion types is present in the area. Life-form 
classes include alpine meadow, fir park, 
pasture, cropland, and burned reforested 
areas. Forest vegetation includes, from 
high elevation to low elevation, such 
types as red fir, white fir, douglas fir- 
ponderosa pine- incense cedar, pine- oak. 
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and oak-chapparal. Thus, the topo- 
graphic and vegetational characteristics 
of the area are well differentiated. 

After a review of available Landsat 
frames which included the Doggett Creek 
area, two were selected for analysis: 4 
July 1973, and 15 October 1974. The two 
frames were obtained as computer com- 
patible tapes from the EROS Data 
Center, Sioux Falls, SD and then refor- 
matted and precision rectified to 
sinusoidal projections. Pixel size was con- 
verted to 80 x 80 m in the rectification 
process to facilitate film-writer playback. 
Using the July image as a base, the Oc- 
tober frame was registered to within a 
half-pixel error using seven control points. 
In this process, the October image was 
resampled to conform with the July image 
using a cubic spline convolution algo- 
rithm. Figure 1 presents an image of the 
study area using Landsat band 5 (0.6- 0.7 
(im) from the July frame. 

Also registered to the July image was a 
terrain image derived from the U.S. Geo- 
logical Survey 1:250,000 digital terrain 
tape for the Weed, CA, quadrangle. In 
the registration process, the image was 
converted to 80x80 m pixel size, and 
stretched to yield a full range of gray-tone 
values. Slope and aspect images were 
generated directly from the registered 
elevation data using a least-squares algo- 
rithm which fits a plane to each pixel and 
its four nearest neighbors. 

The slope aspect image consisted ini- 
tially of gray-tone densities between 0 
(black) and 255 (white) which indicated 
the azimuth of slope orientation, ranging 
clockwise from 0“ to 359“. These values 
were then transformed by a cosine func- 
tion proposed by Hartung and Lloyd 
(1969). Since northeast slopes present the 
most favorable growing environment, and 


southwest slopes the least hivorable, with 
northwest and southeast slopes of neutral 
character, the density tones of azimuths 
were rescaled with 3 representing due 
southwest and 255 representing due 
northeast. (Values of 0, 1, and 2 were 
reserved for special codings.) Neutral 
slopes, oriented northwest or southeast, 
thus received density tones near 127. The 
function also corrected automatically for 
the 12“ skew of the Landsat image. For 
processing as collateral data channels, 
both elevation and transformed aspect 
were converted to three-state variables: 
elevation to low, middle, and high; and 
aspect to southwest, neutral (southeast or 
northwest) and northeast. Figure 2 shows 
elevation and aspect images as well as 
their three-state versions. 

Following an initial reconnaissance of 
the area, thirteen species-specific forest 
cover classes were selected as represent- 
ing the range of cover tvpes within the 
study area. These classes were defined by 
a set of 93 training sites ranging in size 
from approximately twenty to one 
hundred pixels. Further processing re- 
vealed the presence of several subtypes 
within most of the forest cover classes. 
For example, open-canopy douglas fir 
training sites were divided into two sub- 
types. Such subtypes were also defined 
for hardwood, white fir, douglas fir, 
sparce, and grass and shrub cover classes. 
Throughout the classification procedure, 
these subtypes were kept separate, join- 
ing together only in the final classifica- 
tion map. 

In order to obtain estimates of prior 
probabilities for the forest cover class 
types, one hundred points were randomly 
selected from a grid covering the Dog- 
gett Creek study area by drawing coordi- 
nates from a random number table. At 
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each of these points, the forest cover 
class was determined either by interpre- 
tation of 1 : 15,840 color aerial photogra- 
phy, or by actual field visit. Of the 100 
points, 15 were discarded because they 
fell (1) in locations which were inaccessa- 
ble in the time available; or (2) outside 
the area covered by 1 : 15,840 air photos 
(and therefore could not be accurately 
located on either the Landsat frame or 
on the ground). At each point, the eleva- 
tion and aspect class was also recorded, 
thus allowing type counts to be cross 
tabulated according to elevation and 
aspect. 

From this sample of 85 points, three 
sets of probabilities were prepared. The 
first of these recorded the unconditional 
prior probabilities of the forest cover 
types — that is, their proportional repre- 
sentation within the entire study area. 
The second and third sets of probabilities 
aggregated the points according to eleva- 
tion and aspect classes, and were used to 
estimate three sets of probabilities for 
each topographic parameter (low, mid- 
dle, and high for elevation, and north- 
east. neutral and southwest for aspect). 
Table 10 shows how the classes were 
defined, and describes the number of 
points falling into each. 

These estimates of probabilities lack 
precision because the number of sample 
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points is small; with 85 samples distrib- 
uted across 13 c*over types, the calculated 
probabilities are more likely to indicate 
adequately the rank order of the magni- 
tudes rather than the true values of the 
magnitudes themselves. However, under 
constraints of field time and e.xpense, it 
was not possible to prepare a larger 
dataset for this particular trial. Consider- 
ing the sensitivity of the classification to 
extreme probability values, future work 
should estimate these probability sets to 
a higher degree of accuracy. 

This dataset was also used to estimate 
classification accuracies. By recording the 
pixel location of each of the sample 
points, the cover typ>e as determined on 
the ground could be compared with the 
cover type as classified on the Landsat 
image. Because of uncertainties in locat- 
ing each pixel on the 1 : 15,840 air pho- 
tos, it was necessary to specify alternate 
acceptable classifications for each point. 
For e.xample, a pixel falling into a stand 
identified on the ground as douglas fir, 
open canopy, might fall almost entirely 
on a clearing, and thus be classified as 
grass/ shrub, or sparse if containing a 
few' canopy trees. In such a case, the clas- 
sification was termed correct. On the 
other hand, classifications such as 
hardwood, alpine meadow, or red fir 
forest would be an obvious error in a 


T.\BLE 10 

cHevation and .\spect Class Definitions 


Code 

DEriNTTION 

Point Count 

Elevation 



Low 

< 1067 m 

45 

Middle 

1088-1524 m 

-26 

Hii(h 

> 1525 m 

14 

Aspect 

Northeast 

ilTe-- 112.5° 

26 

Neutral 

122.6°- 157.5°. 292.6°-3375° 

25 

Southwest 

157 6° -'292 .5° 

34 
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doiiglas fir stand. Here again, estimated 
accuracies arc influenced by the limited 
size of the sample. 

Note tliat the field data are used to 
produce a classification which is then 
assessed for accuracy by reference to the 
same data. Separate samples would 
clearly l>e more desirable. The d*‘<'ision 
to u.se the same set of samples c- 
tennine accuracy that was u.sed to esti- 
mate the probability sets was. again, in- 
fluenced by available fieKl time and travel 
funds. However, the accuracies are 
greatly dependent on the spatial location 
of the data jwints; only in the aggregate 
does each data jx)int influeiK'c the classi- 
fication. lluis. we would exfxrt the ac- 
curacies tt) refltx.’t only a slight positive 
bias prixluced bv this t'twt-retlucing 
strategs’. 

.\lthough several different chissifica- 
tions were carrietl out. c)nly three iU'e of 
imjxirtance here. The first u.sixl spectral 
data onlv, and assunuxl tnjual prior 
probabilities; this cla.ssification \ieldtHl an 
acciuacy of 58% (Fig. 3). In the second, 
three .sets of prior probabilities for the 
forest types were usetl. each t'ontingent 
on one of the thri*e elevation states (Ta- 
ble 10. Fig. 4). The cla.ssification Mjftware 
was nuxlifitnl to use a table kxik-up of 
prior probabilities with elevation chess ;es 
one index into the prior prolvibilitv table. 
This techni(jue increa.stxl clav;ificatioii 
accuracy from 58 to 71%. 

Tlie thiril clavsification useil two sets 
of pnor probabilities contingent on eleva- 
tion and as|x*ct. analogous to 
and P{u)j|o,) ;n the prtveding section. 
Software then s\stematicallv sanipltHl the 
registenxl elevation cla.vs aiul terrain cla.vs 
images to neld the joint proUibilities of 
elevation anil .usptvt classt‘s, analogous to 
P(r,. 0 ,). Tlie iterative algonthm de- 


scrilxxl earlier was then applied to esti- 
mate the set of c*onditional probabilities 
for forest cover clavses contingent on all 
combinations of elevation and as^iec't 
classes. Classification using these esti- 
mated prolxibilities contingent on lx)th 
elevation and aspect yielded an accuracy 
of 77%. an improvement over that olv 
servetl for elevation alone (Fig. 5). Tims, 
this example demonstrates how prior 
prol>abilities can lx* used to merge I'on- 
tinuous variables of multispei'tral bright- 
ness with discrete variables of elevation 
and aspect clavs to improve classification 
accuracies. 

Discussion 

.■\s noteil earlier, extri-me values of 
prior probabilities can force a clas.sifica- 
tion to lx* made es,sentially without in- 
fonuation concerning the observation it- 
self. NMien priors are iH]ual. however, 
they have no effect. The cla.ssifier. then, 
continuoiislv trades off the role of the 
multivariate infonnation for the role of 
prior infonnation. de^XMiding on Ixith the 
mapiitude of the distance of the multi- 
variate observation vectirr from the class 
mean vector anil the ratios of the jxirticu- 
lar prior proliabilities iiivolvixl m the de- 
cision. \Mien the exjx'riiueiital ilesigii id- 
lows the priors to Ix' iletermineil bv ex- 
ternal iMiulitioning variables, the effivt is 
to classify bastnl on miiltii iU'iate infoniia- 
tion when the jx>ssible classes are not 
jxirticularlv influenciHl by the condition- 
ing variable and to classiK’ baseil on prior 
infonnation when multivariate data are 
eijuivix'al or some i’lasst*s are much more 
or less likelv than others. Tluis. m the 
forest classibcation example, terrain in- 
fonnation st*r\t*il to differentiate sjXH'ies- 
sjxx'ihc iDver tv{x*s (e.g.. rixl hr. white 
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FICLUE 3. Classification map based on spectral data only; accuracy, 58%. 


fir), whereas spectral infomiation differ- 
entiated life form classes (e.R.. grass- 
shrub, hardwood). 

•Another important point concerns the 
dependence of die prior probabilities on 
the scene itself; the relative areas of the 
classes in the output scene must lie accu- 
rately estimated. If the output sc'ene shifts 
in area, then the priors must be changed. 
The clas.sification cannot be extended to 
a new area without reestimating the prior 
probabilities. Thus, it would lie ap- 


propriate to use county crop acreage val- 
ues to set prior probabilities only when 
the entire area of the county, no more, 
no less, is to be classified. In the case of 
one or two external conditioning varia- 
bles detennining the appropriate set of 
priors, both the joint probabilities 
P{i’^,o,) and the conditional probabilities 
P{ujj|i’j} and P(a}^|o,) must truly repre- 
sent the area to lie classified, for, taken 
together, they determine the prior proba- 
bility values actually used in computa- 
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FIGURE 4. CloMification m«p b«ed on spectral d«U, with elevnbon included by varying prior probefadities. 


Key to map symbob is iiKluded in Fig. 3. Accuracy is 71^. 
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tion. In some applications, it may be 
possible to extend the conditionals to a 
new scene in which only the joint proba- 
bilities of the variables change — for ex- 
ample, a forest cover classification with 
elevation and aspect as conditioning vari- 
ables which is extended from one uni- 
form area to another. The new area will 
have different joint probabilities P(i'^, O;} 
(and derived priors P{p^} and P{ 0 /}), but 
it mi^t be reasonable to assume that the 
conditional probabilities are ecologically 
based and remain consistent from one 
area to the next. 

It should be noted that collateral infor- 
mation cannot be incorporated through 
the prior probability mechanism without 
the collection of data to estimate the 
priors and/or conditionals. If the col- 
lateral data are likely to be unrelated to 
the multivariate data and are expected to 
influence strongly the prior probabilities 
of the classes, then such estimating costs 
will be justified, for significant improve- 
ments in accuracy should result. Ulti- 
mately, it is up to the user to balance the 
costs of acquiring such information with 
the value of the expected payoffs in ac- 
curacy which are anticipated. 

The logic which culminates in decision 
rules Rj ^3 assumes that the mean 
vector and dispersion matrix for a class 
are not affected by the external condi- 
tioning variable [see expression (35)] -in 
the remote sensing case, this means that 
the signatures are invariant. In some ap- 
plications, this assumption may not be 
warranted. An example would be an 
agricultural crop classification with soil 
type as a collateral variable. Here the 
signature of the soil itself, at least in the 
earlier states of crop development, will 
influence the crop signature. In this 
situation, there is no recourse but to 


spectrally characterize each combination 
of crop and soil type so that probabilities 
of the form can be calcu- 

lated. Following the logic of expressions 
(5) through (11), it is possible to show 
that 


2p{x,|w„p,) 

lc-1 

(57) 

which could be made the basis of a deci- 
sion rule related to ^ 2 * 

A final point worthy of discussion con- 
cerns modeling of joint probabilities, sug- 
gested in an earUer section to reduce the 
need for more extensive ground sam- 
pling. The model presented is but one 
example of a large variety of techniques 
by which collateral data can be used to 
predict the spatial distribution of classes 
in an output image. Discrete and con- 
tinuous collateral variables can be merged 
either using empirical techniques in- 
cluding multiple regression, logit analysis, 
discriminant analvsis, analysis of covari- 
ance, and contingency table analysis, or 
by constructing more functional models 
which model the spatial processes actu- 
ally occurring in a deterministic way. 
When such models are constructed and 
interfaced with remotely sensed data, the 
result may be extremely powerful, both 
for the ability to accurately predict a 
spatial pattern and for the understanding 
of the complex system which produces it. 

Conclusions 

The use of prior probabilities in maxi- 
mum likelihood classification allows; 

(1). the incorporation into the classifica- 
tion of prior knowledge concerning 
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the frequencies of output classes 
which are expected in the area to 
be classified: 

(2) . the merging of one or more discrete 

collateral datasets into the classifi- 
cation process through the use of 
multiple prior probability sets de- 
scribing the expected class distribu- 
tion for each combination of dis- 
crete collateral variables; and 

(3) . the use of time-sequential informa- 

tion in making the outcome of a 
later classification contingent on an 
earlier classification. 

Thus, prior probabilities can be a 
powerful and effective aid to improving 
classification accuracy and modeling the 
behavior of spatial systems. 
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XASA contract SAS-9-15509, and the 
California Institute of Technology’s Presi- 
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contract NAS-7- 100. I am greatly 
iruiebted to D. S. Sirnonett. P. H. Swain. 
R. M. Haralick, and W. Wigton for criti- 
cal review of the tnanuscript. 
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ABSTRACT 

A mnber of exlstlns statistical techniques can be used to aerg/t spec- 
tral image data with collateral information, including regression, ANOVA, 
llANOVA, ANOOVA, MANOOVA, discrlminait analysis, maxinun likelihood classi- 
ficatim with or without prior probabilities, contingency table analysis, 
and logit nndeling. The choice of an appropriate tei^ique depends upon 
the nature of input and output variables — continuous, discrete, or cate- 
gorical -• and the a ppropriate model — paranctric or nanparanetiir . 

Logit nndeling is a very versatile technique which is well adapted to 
remote sensing application. The logit, vbich is the natural logarithm of 
ai odds ratio for two states of an output categorical variable, is predicted 
by a linear or curvilinear function of continuous or categorical injxit vari- 
ables. Since the log.it models probabilities or proportions, it can be used 
directly as a classifier or indirectly as an estimator of prior probabili- 
ties for conventional maxinun likelihood classification with prior probabil- 
ities. The logit model is nonparametric, a feat’Te which makes it highly 
desirable idien used to merge disparate types of collateral data. The dis- 
advantages of the logit model are that more calibration (training) data are 
required to fit the model, and that fitting requires an Iterative nonlinear 
optimization procedure. A logit model was devised aid fitted to forest 
species conpositicnal data for northern California, predicting the propor- 
tion of tiiiber volisne in each of five species at each pixel biued on regis- 
tered terrain data quantifying elevation and slope aspect. 

Another versatile tool is maxinun likelihood classification with prior 
probabilities. By making prior prob-itiillties conditional on a collateral 
data channel, the information contained within the channel can be conveyed 
to the naxlnun likelihood algprltlim. An exanple is in land use, in vblch 
a previous classification and oi externally devised transition probability 
matrix are used together with new image data to produce an updated classi- 
fication consistent with the observed pattern of change. This technique 
has relevoice for monitoring urban expansion and the inpact of forest 
clearing in developing nations. 


1. INTRODUCTION 

Viewed in a broad context, the problem of coobining image data and spatial collateral data 
into a predicted Oi'tput map or image is actually a problem of conbining continuous and categori- 
cal variables in a nrxleling fronework capable of producing continuous or categprical outputs. 

At the University of California, Santa Barbara, N/^-supported researdi (grant NSO-2377) is cur- 
rently underway to identify existing nndels and procedures for spatial modeling, and to apply 
them to selected datasets to demonstrate their applicdiility in a remote sensing situation. 

Collateral data, here defined as preexisting spatial information in the form of a map, pro- 
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cessed image, or set of observatlcns at grid coordinate locations, can be confined with Landsat 
or other renotely sensed digital imagery to enhance classification accxracy or to construct 
models which (>re^t spatial patterns of groind phenonena. Collateral Information can be either 
continuous or categorical in nature. Exanples arc ele^mtlan, slope, or aspect channels obtained 
from a digital terrain model (continuous); and rock type, soil ty^, crop type, or land use 
(categorical). Image data, with lAilch collateral Information are to be confcln^, arc typically 
continuous in nature, althouf^ values may be quantized Into discrete gray tone levels for data 
processing applications. Desired outputs msy also be either categorical or continuous. Any type 
of classification constitutes a discrete or jategorlcal output, whereas such outputs as percent 
bare groind, tijiijer volisne, soil loss, or forage cover arc continuous In character. 

Figure 1 presents appropriate techniques for the merging of continuous, categoiical and 
mixed (&>th continuous and categorical) datasets to produce either continuous or categorical out- 
puts. Continuous o'Jtput models. Including regression, analysis of variance, and analysis of co- 
variance, are all msthematlcally related and based on least squares algebra. Categorical output 
models arc more diverse and Include varlan:« maximizing techniques such as discriminant analysis 
as well as the nonparametrlc methods of contingency table analysis and logit modeling. Maxliiun 
likelihood classification may be viewed as a special case of logit modeli^ In whldi the Input 
variables are assimed to be normally distributed. Nonparonetrlc and maximun likelihood techni- 
ques, because they produce probabilities of classification as an output, also have the advantage 
that they cci be adjusted for prior probabilities. Because many present ranote sensing applica- 
tions call for categorical classification, these latter methods are probably most useful in com- 
bining continuous image data with categorical and continuous types of collateral data. 

Demonstrations of a selected set of these techniques are plamed and uider current develop- 
ment; their c u rr e nt status is discussed In following sections. Several of these exoiples have 
inportant inpllcatlons for remote sensing In developing nations. In one application, logit 
modeling Is used to fit a model which describes the probability of occurrence of various forest 
species given elevation and slope aspect values obtained from a registered digital terrain model. 
0^ obtained, these probabilities can be used as prior probabilities in a maxlmn likelihood 
classification of a Landsat image with registered terrain data for natural vegetation inlts. 

This tedmlque could facilitate the accurate identification of forest types In conplex tropical 
upland environments. 

In another exanple, land use classification for change detection nenitoring Is inproved by 
a contingency table-analytic technique which quantifies the probabilities of change for each 
land use type during the fixed time inteival. Ihis tectmique has in^xrrtant inplications for many 
developing nations, especially in Central America, where urban expansion Is impacting agricul- 
tural land, and forest clearlim for agriculture is inpacting large natural envlromcnts . Addi- 
tional exaiples are being developed, focused on Landsat and collateral datasets obtained for an 
area of Ventura County, California. 


2. STATISTICAL TECHNIQUES 

Figure 1 identifies a set of statistical techniques which are relevant for conbinlng colla- 
teral data in the context of Landsat modeling and classification. The technlqes can be seen as 
a double level hierarchy, ranging from continuously measured independent variables in the first 
col am to categorically measured independent variables in the last colunn. In the first row, 
the dependent variables are are continuous In nature, and in the second row the dependent variables 
are categorical. Continuous variables are measured on Interval or ratio scales, whereas categori- 
cal verifies are measured at nominal or ordinal scales. Categorical variables can be of thrM 
types- 

(a) dlchotonous (eg., presence or absence, yes or no) 

(b) unordered polychotaiDU? (e.g. , land uses; agriculture, urban, forest, etc.) 

(c) ordered polvchotonous (e.g., low runoff, median, and hl^ nnoff) 

The statistical techniques in the first row have been thoroughly docunented and are cormonlv used 
in social science (Blalodc, 1972, Graybill, 1%1; ibrrisen, 1967; Winer. 1971) However, 
there has been relatively Uttle work in the area of applying these approaches to rorole btnsing.. 
The second row of the figure describes techniques which are generally less well loiown, but also 


1010 


ORIGINAL PAC£ IS 
OF POOR QUALITY 


inc'.uit inudflun likelihood cel«slficatlan as camnonly carried out In rtRocc scnalnp,. The rcnaln- 
Ing portiona of this section discuss the theory «k 1 ranote sensinp application of the tedvilqucs 
Idmcifled in Figure 1. 

2.1 (XNTINUDUS RESPONSE VARIABLE 

The etatlstical techniques of the first row of Figure 1 share one thing in comnun — they 
are all different form of the basic linear regression nodel. Consequently, the theory and 
application of the different types of regression in row one will be very sindlar. Cell (a) %flll 
be examined in detail, aid exc^ where explicitly specified, the analysis can be extended to 
cells (b) Old (c). 

2.1.1 CELL (a) -- GOtn'INUOUS EXPLANATORY VARIABLES. The nnst coniDnly used method In this cell 
is regression, which predicts a continuously measured response variable from ccxitinuously mnas- 
ured explanatory variables. The model (In vector notation) can be written: 

y - M + e 

where y Is the vector of observed dependent variables, 8 is the vector of imkniMn parameters, X 
Is the'vector of observed independent variables, and c is the vector of errors. (Table I des-* 
crltes the synbols used in this paper.) Typically, the vector 8 Is ixiknown, and mist be estl* 
nated from a dataset for vhlch both response and explanatcnry variables ere observed. This estl- 
nation Is done by the process of ordinary least scpiares regression (OLS). OLS repression finds 
the slope aid the Intercept of a line runing thniugh the (lata vhlch minimizes the variance of 
t(y-!)^ , or the sini of tne squared differences between the observed Y and the predicted y. The 
vector of predicted Y value Is calculated by; 


t - ftX 

The reacssloi Is acconpllshed by defining the quantity Q equal to I (y - y) taking. the first 
partial derivatives of Q with respect to the values In the vector 8 and setting tlwse partlals 
ec]ual to zero. By ikf Initial of a parltal derivative, a ndnimm has been foisid. 

The best statistic fer measuring the strength of the regression is There are several 
ways of calculating R^ , but the following Is ccxiceptually the sinplest. The residual sun of 
sqiares (RSS) or the total anoisit of variance in the model not explained by the regression Is 
calculate by; 

n 

RRs - r (y. - y.)* 
f- I * ‘ 

The RSS, vhen divided by the total corrected sun of sepjares In Y (written TSS), gives the propor 
tlon of unexplained variance to total variance. R^ . or the proportion of explained '.’ariance. Is 
obtained by siiitractlng this ratio fron one; 

R? - 1 . 252 

TSS 

One exanple of how regression ccxild be used within the context of merp.ing Landsat and col- 
lateral data Is bionass modeling, ilie regressim exmple uied for cell (a) Is restricted to 
continuously measured independent variables. For slncllclty, only one spectral channel and one 
collateral data scxirce are used. Adding ocher chameW and other cx>llateral data sources Is a 
St ralglic forward procedure. A basic linear model could be- 

Rf " h * 8| (W.'^.'-’^) + 8?(r«/*i^) 

where P . Is Che predicted biomass for pixel {. Is observed nulclspectral scanner data (as 
single band, or nultlbarxl ratio or traisform) for pixel i. rain . Is observed rainfall cm pixel 
{, and the vector of betas (A 0 ... 87 ) are the estimated regressl^ parameters. 

Regression models are applied as a two stage prcxress. First, the model nust be cuilibrated 
(estimate the vector of 8 's) by regressing the (Observed independent variables on the observed 
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dependent variable. In thia exanple, bicmass data are required from a sufficient nafaer of 
pixels to give a representative s^le of the Landsat imgp to be nrdeled. The lucatlons of 
data points are then "rubber-sheeted" to a geometrically corrected .aKlaat Image, and the linked 
biomass and HSS data is directly accessed a statistical software package, sudi as the Statis- 
tical Analysis System (SAS), to calculate the betas and . Secondly, if is statistically 
significant (i.e., the calculated betas explain a significant portion ox the total variance in 
r), the model is extended as a predictor to other plwls where the dependent variable has not 
been observeo. This, in effect, constitutes a model of biomass predicted by surface reflectance 
and measured ra^’^fall. 

2.1.2 CELL (b) — .<(IX£D EXPLANATORY VARIABLES. This cell includes conventional regression 

models which are similar to those of cell (a) but also include a mixture of continuous and cate- 
gorical explanatory variables. Such ..xxlels are ccnnDn in social science research, the categori- 

cal explanatory variables often being termed "dunny" variables. It coi be shown that the more 
familiar statistical test Analysis of Covariance (ANOOVA) is a straightforward extension of OLS 
regress Icn with diimw variables (variables that assune values of 1 or 0 depending on the presence 
or absence of the qualitative variable being measured). 

As an exoiple, the model used in cell (a) will be extended to cell (b). In this cell we 
are able to include data mea.sured categorically — for exanple, soil type %diich cmi be observed 
in two states, referred to as class 1 <md class 2. The model now beconcs; 

P. - 6o + + 62 (.ritin.) + BP/, + Bi,/*,-? 

where the biomass for pixel t is pr^cted by the variables used in cell (a) in coibinatlon vrith 

the duimy variable terms B4r. and If the observed soil type for pixel t is class 1, 

then the categ^arical beta thkt will be' used is Bj. vhereas if the soil type is class 2, thai Bm 
will be used. This is accoiplished by defining P{\ equal to 1 if the soil type in pixel t is 
class I and 0 otherwise and by defining T,*.. equal to 1 if the soil type is class 2 and 0 otherwise. 
It is a relatively sinple task to expand this model to include several (polychotcmous) soil types 
or to include other categr'rical data. 

VIhen there is more than one Interval scale dependent \«riable, the ncdel is called MANOOVA, 
or Multivariate Analysis of CovarlaKe. In this mxlel, all of the independent variables are 
repressed against ea^ of the dependent variables, with .separate ff^s and F ratix» calculated for 
each depend^t variable. An exanple of this is: 

^..S{ " Bo + Bi(W.‘>V',‘) + B?(>u»'n,-) + B.,/’,’| + Bi/*,'? 

where Pf is predictei binnass for pixel t . S{ Is predicted soil loss for pixel », and the other 
variables remain as in the preceding exanple. t1A(KX1VA con be seen as a device to test more than 
ine ANCUVA moUel in die sane statistical analysis. 

2.1.3 CELL (c) — CATH50RICAL EWTANAIXIRY VARIABLES. The extenetd regressitn nndel examined in 
ceil (b) is also applicable here. When the explanatory variables are all mBasured on the ordinal 
or nonlnal .scales, the model Is usually called Analysis of Variance (ANCIVA). All that is neces- 
sary to mive from cell (b) to (c) is to exclude fron the arralysis all cxplmuitary varl^les tliat 
are measured on the interval or ratio level. As an exoiple of ANOVA, the bionoss nudel would be 
written: 

/>,• - ^0 + »\Pii Mm + 

Here, Bi ami B^ refer to t>xi different soil types aixl atxl B4 refer to rainfall that has been 
categorized into two lerels (high low) . It vxxild be possible to categorize MSS data for use 
in such a nuHlel, hut there is cixrsidcrable information loss. Consequently, the ANCVA or MAN7VA 
^Multivariate Analysis of Variance) mxlel is rx>t likely to be ns useful as the ANCXIVA or MANCTB/A 
mxlel. 

2.2 CATttXTRICAl. RESPONSE VARL\BLE 

The first three cells tuive dealt with Landsat ItlS data arxl contlreiuus dependent collateral 
variables as predictors in varUxis statistical mxlels of physical phenemnu. The last three 
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cells, in which the dependent variable is categorical, iipen « 4 > nev vnys of utilizing collateral 
data within the Landsat structure. With categorical data and with the appropriate statistical 
techniques (see Kigjure 1) it is possible to: 

1) nndel physical and social phcniiFna that are best represented in discrete steps -- lev, 
nediun, high (soil erusioi, fire hazard, bienuss, housing quality, nunicipal services, 
etc.): 

2) cl-issify the dependent variable into nednal groupings (land use, vegetation aananlty 
type, etc.); 

3) create predicted probabilities that the depeitdcnt \ariable will assune particular cate- 
gories and use throe probabilities to clns.slfy an iimge directly or use them in con,|mc- 
tiun with other data (usually 1K>) in a Havselan maxiinin llkelilKxxi classifier. 

Ihlike the first row of Figure 1, the sectsul nv includes five different statistical techni- 
ques. For purposes of nudeling with Landsat irtder the constraints of the second iw, tiaximm 
Likelihixid Classification (MLC) with Prior PrvibaliilltieK is the n»>st iitportmt method. "Moxi- 
nun llkelihoodT* is a statistical property of an estimator, and, used in its proper way, iiiftlies 
that an estimator has the highest pniuibility of pnxkclng the data which vere itsed for calibra- 
tion. Ikvever, its use in rtsmte sensing in^lies a particular decision rule QIC) pi>s.sessing 
this proi’"rry. %Jhich is discissed heUv. CXir research has shewn that the inclusitxi of collateral 
data as prior probabilities to MX' offers a sinf>le and effective wav of cMihining (collateral and 
ranotely sensed data. 

The kev to thi> use of prior pnhahilit ic.< is the logit regresslcTt pixlel, which takes col- 
lateral data (cnitinuous or categorical) .md generates predicted probabilities. 'Diese predicted 
probabilltlrs can be imed as a direct classifier (l.e., the pixel is classified into the cate- 
gory that has the highe.st probihillty) . but the most likely iwage of these predicted probabili- 
ties Is as input to ^ IlC decision rule, in which they serve ns weights. Since the logit re- 
gression nndel is prohably the least familiar of the statistical technic|vs‘s to renvte sensing 
researdi, and since it ir applicable in everv crell in tlx> bottom nv. it will be explored with 
the imst detail. 

niscrindnant Analysis is similar in its us^ige to maximim likelilxxKl with prior pnhabilities 
hut because of its cmputatiimal complexity it has not been used often in the renote sensing cct»- 
text . Ccmsequ'ntly, it %vill he cxily briefly discussed. Chl-Scjuare Analysis, which is the tra- 
diticnal mvilysis used m cxnt ingccncy tables (all variables aro categorical) is bv definition 
not (xmpatible vrtth interval or ratio scale rcm'telv .sen.sed data. It is similar to /VNHVA. 
except that the cxitput is categorical . 

2.2.1 IMi. (d) — aWINlWlS LaanViNAlllRY VARl(\HLKS. In tlx» past ten years, anxlmm likelihcxxl 
clossificaticn has fcxnd wick* opplicaticn in the field of retaMe sensing. Rased cn nnltivariate 
nornnl distributicn theory, the MX' algin^itlm has bt'on in use for .applicat ims In the social 
.sciences since the late l%0*s. IVoviding a pmbiibillstlc mc'tlxxl for recognizing similarities 
between indivickuil nro-siirairnt s .axl predefinc*d stJBxlinls, tlx? algcrltlm fcxnd increasing use in 
the field of pattern rvcxigniticn in the folUwrtng dec.icics (Clxv, 1957; Sebestycn, l%2; Nilssin, 
l%5). In remote sensing, tlx* ck?\x*lopmi*nt of milt ispix'tral scandng technology to proAx?e 
layered milt 1. spectral digital imngi*s of land ari*a.s fnm aircraft or .spacecraft provided the 
cpporlinity to use the maxtmm likelihcxxl classifier in prcxkx'ln>* tlxmatic cla.sslficaticn maps 
of large* areas for such purpcxces as huxi iLsc*/laixl cowr ck'terminat Icn and natural cultivated 
lend ln\a*ntorv (Sc'lx*ll, 1972: R»x*ve*s et al . , 1975) 

before pn*scnt inj* a practical t*x;inple. It will lx* helpful tc* briefly reviev tlx* mullumnt ics 
of tlx* mixiimi likelilxxxl decisicn rule. In tlx* miltivariate remte sensing applicaticn, it is 
as.sirnixi tlvit ecx*li olxw*rvntlcn X (plxi*l) ccxislsts of a set of m*a.sun*m*nt s m ; v.arliiile.s (clvm- 
nels). Tbnxigji sem* external pnxx*ckirc a set of ch.serv’at icxis wliicli cx>rresfxxxl to a class is 
ickntifi(*d — that is, a sc*t of similar ob)c?cts clviractcriixxl bv a wetor c'f m*oas cn m*asuri*- 
m*nt vnriablro cnxl a vari«xx*-cnviriaiHX' mitrlx ik'scrlblng the lntt*rrolat icxishlps cminj*. tlx* 
m*asuremnt variables wlilcb are clwracterlst ic of tlx* class. Altluxigli the panmetrlc nx'cui vec- 
tor and dlsperslcn matrix for tlx* class remiin u^axwn. they are estimited bv the sanple nxvms 
.ind dispersUn imtrix as.scx'latcxl with tlx* ob)cx't .s.xq?lt* 

Miltlviu'iate ixirmil stati.stlcal tlxxirv ck*scribes the prcdvibilitv tliat an ob.st*rvat icxi X will 
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cx:air, given that it belongs to a class k, as rhe following fizictlan; 

- “P' " ‘'P 


As applied in a maxinun likelihood decision rule, the previous expression allows the calculation 
of the probability that an observation is a wenber of each of k classes. Ihe observation (pixel) 
is then assigned to the class for whidi the probability density value is greatest. Since the 
log of the probability is a raonotonlc increasing function of the probability, the decision can 
be made by conparing values for each class as calculated from the right hand side of equation. 

A simplified remote sensing classification rule using maxinun likelihood (Tatsuoka, 1971; 
Strahler, 1980) with k possible categorical classes and p channels of MSS input datasets is to 
choose die k (class) which ndnindzes 


Fi.kO^;) - ln|D> I + (Xj - mj,; 'V'Otj - n\). 

This expression is derived from the preceding one by taking the natural logarithm and deleting 
terms which are constant for all classes. 


Interval or ratio level collateral data can be incorporated as extra "logical" channels 
within this model. One successful forestry application was achieved by the creation of a tex- 
ture channel which was synthesized from Landsat Band-5 by taldng the standard deviation of den- 
sity values widdn a 3 by 3 moving window, scaling this value, associating it with the center 
pixel of the 3 by 3 wirxlw. and returning it in imacre format (as a fifth channel) . Values in 
the texture channel describe the variation in image tone within the umediate area of each pixel. 
High values are characteristic of edges and boundaries . whereas lower values describe more uni- 
form areas. This technique was shown to significantly increase classification accuracies for a 
miiber of species-specific forest cover types in northern California (Strahler, 1978, 1979). 
Strahler (1973) denixtstrated how to input collateral information in the form of elevation data 
and slope aspect (in conhination with a texture channel) as separate "logical" channels, increas- 
ing the classification accuracy by 27 percent. 


2. 2. 1.1 lx)git Regression . In extending the conventional regqression models adopted in cells 
(a), (b), and (c) to the problems of cell (d), two difficulties are encountered. (For details 
see W^ley, 1976, p. 8-9; 1977b; p. 12-13). First, a conventional regression model with a 
categorical response variable will folate the constant error variance or homoscedasticity 
assuiption. Uhile this problem does not result in biased or inconsistent paraneter estimates, 
it does result in a loss of efficiency and gives rise to serious problems if conventional infer- 
ential tests are used. Secondly, a conventional regression model with a categorical response 
variable may generate predictions which are seriously deficient. It can be shtxjn that the pre- 
dicted values of the response variable in such a model are best interpreted as predicted proba- 
bilities. The problem is that although probabilities are constrained to lie within the range 
of 0 to 1, the predictions generated from six±i a model are irdxxrided and may talce values from 
minus infinity to plus infinity. Thus, the predictions may lie outside the meaningful range of 
probability and may be inconsistent with tlie probability interpretation that was just presented. 
The simplest yet most statistically sound solution to the probability problem (within a regres- 
sion franework) is the logit transformation, in vMch the probability P. is moiled as 


P. 

i 


rTT^i' 


and the probability of "not P/' is: 


P. = 


1 + e 




where 6X is the vector product of betas nultiplied by row vector of X's (observed explanatory 
variables). Although these two equations are nonlinear models, it is a simple matter to rewrite 
than as 
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Ihe logic cransfonnticn is achieved by taking the natural logarithm of Che preceding formila. 
Uiich yields 


P. 

In B.V 

1 - P. 

1 

Ihis transfonnatian has the property of increasing frcm minus infinity to plus IrfinJt/ as 
increases from 0 to 1. Once efficimt estimators are calculated for the betas, sinple algebix 
vrt.ll extract the value of P.. The methcxi can be generalized tot- classes, in which there 12 
It - 1 logics of the form 



Each logic must be ntjdeled separately, pnxkicing It - 1 sets of betas. As in "iie ' inar>' case 
described above, algebra will extract Che values of the probabilities from Hr? 1 • logits pre- 
dicted for an observation along with the cinstraint chat all probabilities mist sim „o one. 

Uhlilu* Che conventional regression mxlels of Che previous cells, all cc which can be effi- 
ciently estimated by the ordinary least squares (OLS) method, the logi'tic and linear logit re- 
gression nndels appropriate for the problems of cell (d) require either a weighted least .squares 
(ULS) estimation procedure or a maxinun likelihood procedure. The dioice between the two methods 
depends upiTi whether the calibration data include repeated observations for each combination of 
values of the explanatory variables (in the case of WLS) or not (in the case of raaxlnuii likeli- 
hood). Since such replications are inlikely In calibration of the logic model for remotely 
sensed data, maximim likelihood estiimtion is preferred. (Note chat here Che term 'Vnaxiimm 
likellhoiid" refers to a parameter estimation method different from nulcivariate normal MLC.) 

The maximm likelihtxxi solution to the calibration of the logit model has many attractive features. 
It can be slx-wn that provided the sanple (iata are not tail ticol linear, a mique maxii.<Jii likelihood 
estimator can be obtained even in relatively small sai^les. Also, the math^tlcal properties 
of the likelihood fixiction allow for efficiCTt ccmputer programs to produce the parfineter esti- 
mates, and these estimates are consistent and are the best possible estimates in very large san- 
ples. The disadvantages of the procedure are that it involves mirerical optimization and there- 
fore more costly ca.putatiim, and that it is a less faniliar statistical technique. For a de- 
scription of the pnxredure used to calculate such imximm likelihcxid estimates, see Cox (1970, 
p. 87), Mantel and Brcvn (1973, p. 65A-5), ifrigley (1975, p. 191-3), Drmencich and McFadden (1975, 
p. 110-12) and Schmidt and Strauss (1975a, p. 404-5). 

There are two ways tlie logit nuxlel can be used within the context of remote sensing. The 
first is to act directly as a nonparametric classifier (l.e., to classify a pixel into the class 
having the highest predicted probihility) . The second is to use collateral data in a logit model 
to predict prior probabilities and input them to a )liC decision rule which accepts prior proba- 
bilities. This approach effectively cmhines a nonparametric logit nodel for collateral data 
with a paronetric tLC ntxlel; it will he discussed in a following section. The following exanplc 
of the logit model Involves the direct classification of land use by MSS and terrain data. The 
model is 


In ■ 60 + 5i(haml5,0 + 

1 - r,-i 

where P{\ is the probability that pixel • la class I and the follixrtng terms iiidicate linear com- 
binatim of spectral and collateral data. If desired, tlie midel can easily be expanded to all 
four bands aid all the ctmtinuous collateral variables relevant to the classification. 

2.2. 1.2 D iscriminant Atyilysis . Discriminant analysis is a nultlvarlate technique used to 
pnxkice sets of iiKorrelated fimctlons which separate observat iixis nost efficiently into predesig- 
nated groups. A discrlmirumt model, or classlficatliTi criteron. Is developed, the values of 
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which define groups for the observations. The individual observation is classified into one of 
the previously defined gro^is by a neasure of generalized squared distance. 

This technique requires sone difficult confutation. _ Deriving the classification criteria 
requires extracting eigenvectors from the nonsymnetric W~^B matrix, where B and W are respec- 
tively, the between aid within groups suns of squares and crosspro^ts matrices. The mathe- 
matics of this process are beyond the scope of the paper (please refer to Tatsuoka, 1971; Cooley 
and Lobnes, 1971). The technique is helpful in the social sciences for identifying verifies 
which do the best job of separating classes, but is not usually used to process new data for 
classification. Further, the technique assiines that all classes possess an identical dispersion 
matrix, an assuiption uillkely to ch^acterize remotely sensed data. 

2.2.2 CEIL (e) — MIXED EXPLANATORY VARIABLES. Conceptually no new problems are encomtered In 
moving from cell (d) to cell (e) ; categorical explanatory variables are included tiirough dum^ 
variables. The logit model has high potential for application in remote sensing. Since it is 
capable of Incorporating both bontinuous and discrete input data and generating probabilities 
either directly for classification or as input as a collateral data set of probabilities to EtX) 
(Strahler, 1979). In other words, interval level measured data such as rainfall, elevation, 
slope, etc. (no matter what its variance-covariance) can be confcined with discrete data such as 
soil type, previously classified land use, census tracts, etc., in a nonparametric , logit frame- 
work end the result will be a discrete output sudi as a land use map, a soil erodibility map 
(divided into discrete levels) or other special use maps. 


The logit nudel, as in the previous cell, can be easily extended to cell (e). Again, land 
use could be predicted by 


In 




1 - P{i 


Bo + 6i (/'a»i</5,') + 6j(nl<?.r>e,*) + 


vherp Pii is the probability of land use for pixel / over the probability of all the land 
uses that are not bandb; is the E6S value for pixel ttlcpef is slope of the pixel, and 
is a duimy variable with a value of 0 if the previous land use on pixel i was class 1 and 0 
otherwise and D {2 value 1 if the laixl use on pixel i was class 2 and 0 otherwise. 


Preceeding paragraphs have referred to the use of prior probabilities in nodifying the out- 
come of a MLC. Since the prior probabilities can be moiled to reflect both continuous and cate- 
gorical input data, and the input of MLC is a categorical calssification, this tehcnique is ap- 
propriate to discussion of cell (e) . Prior probabilities are incorporated into the classifica- 
tion throu(di the manipulation of the Law of Conditional Probability. The acutal derivation of 
the prior probability is beyond the scope of this paper (see Strahler, 1980). The modified 
decision rule is to choose 1: which minimizes 


F;,^.(Xj) - InlD-J + (X.r - m^.) ’D".' (X.,- - m;^.) - 21nP(e^) 

where the only difference between this formula and the one presented in cell (d) is the probabil- 
ity term. -21nP(0i.) . This form of the decision rule is usually attributed to Tatsuol;a and 
Tiedeman (1954; Tatsuoka. 1971). 

It is isportant to understand how this decision rule behaves widi different prior prob^il- 
itles. If the prior probability P(6;.) is very small, then its natural logaritlm will be a large 
negative mnher; when nultiplied by -2. it will become a large positive nunber and thus F?,i, for 
sudi a class will never be minimal. Therefore, setting a very small prior probability will 
effectively remove a class from the output classification. Note that this effect will occur even 
if the observation vector X: is coincident with class mean vector n^, . In such a case, the quad- 
ratic product distance function (X; - nt)'Dj!*(Xj - m;.) goes to zero, but the prior probability 
term -21nP(e;f) can still be large.' Thus it is entirely possible that the observation will be 
cisssificd into a different class, one for vbich the distance function is quite large. 

As the prior probability P(s^) becones large and approaches 1, its logarithm will go to 
zero and Fj.if will approach Fj.;, for that class. Since this probability and all others must sum 
to one, however, the prior probk)ilities of the remaining classes will be small nurbers and their 


1016 


ORIGINAL PAGE IS 
OF POOR QUALITY 


values of F;,;;. will be greatly aup^iented. The effect will be to force classification into the 
class with hl^ probability. Therefore, the more extrene are the values of the prior probabili- 
ties, the less Inportant are the actual observations values X,*. 

For a minerlcal exanple of how prior probabilities can affect the decision of the mxinun 
likelihood classifier, please refer to Strahler, 1980. There are so many potential applications 
of prior probabilities and the naxinun likeliho^ decision rule that it would be comterprcxiuc- 
tive to list them all. In general, aU data that is relevant to a classification model can now 
be incorporated and this process has been shcMn to significantly increase classification accura- 
cies (Strahler, 1978; Strahler, 1980). 


The versatility of the prior probability techniques comes aKnt when the priors are allowed 
to vary on a plxel-by-pixcl basis. The priors for a pixel may be determined by a liiglt or other 
nxlel, or by using a set of class-conditicnal prior prob^llities estimated by sanpllng. Because 
the priors are cvm;)uted separately, it is possible to mix any sort of model estimating prior pro- 
babilities with a multivariate normal MC algorithm which is known to be well suited to most 
spectral data. Thus, the technique alUws easy, flexible merging of collateral data, used to 
predict the priors, with continuous image data. These points are discussed in more length in 
Strahler (1978). 


2.2.3 CELL (f) — CATEXDRlCrtL EXPLANATORY VARIABLES. Data that falls into this level has been 
traditionally analyzed by contingency table analysis vri.th Chi-Square methods. But statistically 
speaking, it is a sinple matter to extend the logit model of cell (e) to cell (f) through the 
use of diiimy variables. For data that only ceme in nimdnal or ordinal levels, the logit mxlel 
offers new «id in^rtant insights into the data (Wrigley, 1979; Thell, 1970; Grizzle, Starmer 
and Koch, 1%9; Ktx-h et al., 1971, 1972. 1976a, 1977; Laixlis and Koch. 1977; L/hnen and Koch, 
1974«, 1974b). As in earlier discussion, the logit nxxlel can serve directly as a mriparametric 
classifier, using ixily categorical variables input as dutmv variables. In this form, the logit 
mxlel is equivalent to a log linear nuxlel of a contlngencv table; such tnxlels are discussed fully 
in such texts as Bishop et al. (1975). 


The categorical logit mxlel is formilated in the exarple below 


In 


1 - J’„- 




where there are two ixitput categories — 1 ami not 1 — which are modeled by categories of soil 
type as described in cell (e). The classifier sisf^ly assigns the output pixel to the class with 
the higher pn^babllity. 


3. APPLICATIONS 

TVio statistical mxleling techniques, logit mxleling ami maximm likelihoixl classification 
with prior pn<>nbilities, were .selected for further invest igatiixi in the cintext of a real apfill- 
catiiTi. A linear li'tgit mxlel was de\d.sed and fitted to forest species <Mi|xxiitional data for 
nirthem California, predicting the pn>portiixi of tinber vohine for each of five coniferous spe- 
cies at each pixel based in register^ terrain data quantifying elevatiiTi i«xl slope aspect. 
Maximiiti likelitxxxl classification with prior probabilities was tested in Ventura Cointy, Cali- 
fornia in a land use application. A previous Landsat classification and an externally devised 
transition probability matrix were useol together with new Landsat image olata to pnxlix:e an vi- 
olated classification consistent with the observed pattern oif change. 

Throxighoxit this research we Iwive utilizeol the Vlokxi Iraagi' C*mndnicat ion anol Retrieval 
(VICAR) system and the Image Ba.sed Information System (IBIS) reslolent at IC.SB. VICAR/IBIS, ole- 
velopeol at the .let Pn->pulsion l^xiratorv (.IH.) at Pasaolena. Cwillfoimla. II^, is a )ob control 
language which jx'rmlts the sixjuentlal linking and execution of a vast array of Fortran anil 
Assenbler nxitines in a batch enviriTment. In aolditlm to extensive usage of existing VICAR/ IBIS 
nxitini's. now VICAR and non- VICAR software were ckmelopetl and/or modi tied as required for the 
purposes of this resivirch. 

3.1 uxirr irrer iNK of speties mipoRTicNs 
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Logit modeling of species proportions used data derived from the Klamath National f^orest, 
located in northexn California, USA, (llgure 2). Ranging in relief from SOO to 8,000 feet, the 
Forest includes 2,600 square miles of rugged terrain in the Slsldyou, Scott Bar, and Salmon 
Mountains. Little of the area Is develop^ beyond management for tiafcer yield, llvestodc produc- 
tion, aixl recreation. A wide variety of dlstiratl'^ vegetative types Is present in the area. 
Forest vegetation Includes such coniferous species as n^le, red, white, and douglas firs, pon- 
derosa pine, and Incense cedar, as well as several oaks, and typical species of chaparral. Thus, 
the top^a^lc and vegetational characteristics of the area are well differentiated. Within the 
Klamath National Forest, a study area including most of the Gcx>senest Range was selected for 
logit modeling of species conposltlon from terrain features. This area was chosen because cali- 
bration data and Lancisat images were readily available for It. 

The logit model devised for this forestry application requires preparation of digital ter- 
rain data. These data, obtained from the National Cartographic Information Center, In Restexi, 
Virginia, USA, are derived frari pnxesslng of 1:230,000 contour maps, and include elevations at 
every point on a grid of approximately 65 m spacing. Although the data are corparable in scale 
to a Landsat image, the elevation values are quite generalize because they are produced from 
small scale contour traps by interpolation (Figure A). 

Slope angle and slope aspect channels can be produced using the elevation data of the regis- 
tered terrain image. Althou^ a tniiber of slope ^d aspect generating algorithms are known, 
the slnplest Is the fitting of a least squares plane through each pixel and Its four nearest 
nel0nbors and the calculation of the downslope angle and direction of the plane. Slope angle is 
obtained relative to the nmeric range of the elevation channel and image grid spacing, and azi- 
muth Is deteirndned with respect to t±ic rectangular image grid. These channels are best generated 
directly during the preprocessir.g of the original terrain data. At that time, slope angles and 
aspects can be calculated from naif-word absolute elevations arrayed in a north-south east-west 
grid. 


The slope aspect image (Figure 5) consisted initially of gray tone densities between 0 
(black) and 255 (white) v«hich indicated the aziauth of slope orientation, ranging clockwise from 
0® to 359°. These values were then transformed according to the fxmetion below: 

nr.-'den - 3.0 + 126.0’'^(1.0 + cos( .024933275*(o/dde»7 - 26.1))) 

v^ere oLdden synbolizes the old (azimjth-keyed) gray tone pixel density value, neuden synbollzes 
its transformed value, and the argunent of the cosine function is expressed in radians. This 
fuiction transforms density values according to an orientation proposed by Hartung and Lloyd 
(1%9). Since northeast slopes present the most favorable growing environment, and southw^t 
slopes the least favorable, with northwest and southeast slopes of neutral character, the den- 
sity tone azinuths were rescaled by a cosine function with 3 representing due northeast and 255 
representing due southwest. Neutral slopes, oriented northwest or southwest, thus received den- 
sity tones near 128. Flat pixels were coded with zeroes. The function also corrects automati- 
cally for the 12° skw of the Landsat image. 

The logit model fitted is shown below: 

... 

ln(— ) ' 01 ,;;.+ ^ - 1.5 

vhere is the probability that the board-foot of tiitber volune will be drawn from one of five 
species P.j, is the probability that the board-foot will not be drawn from species Jt, F is ele- 
vation (compressed to 0-255 range). A is aspect transformed as described above, S is slope angle, 

and are tlie estimated regression constants. Note that five equations, one for each 

species,' actually conprlse the model. The model was calibrated using 73 measurements of tiaber 
volune prepared by tlie U. S. Forest Service and located within two subregions of the Goosenest 
range. These sanples are probably not representative of the entire area modeled, but serve for 
the dennfistration purposes of this research. Each sanple was located on 1:15.840 scale color 
air photos and tranferred to Band 5 of the Landsat images to obtain the line and sanple coordi- 
nates of the sanple point. The coordinates were then used to extract elevation and aspect values 
for the sanple from thr> registered elevation and aspect images. The coefficients for the model 
were fitted by a nonlinear optimization algorithm enplo\’ing the Newton-Raphson method to .select 
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coefficients with maxiioin likelihood. 

Given the constants produced by the procedure discussed above, the probability linages were 
created using the new VICAR program "FROBMAFS." FRDBMAPS, written specifically for this appli- 
cation, calculates the probability of species k for each pixel using the following expressly: 




v*ierr Qj. - exp (Bi.jt + §2,;^ + 83, + 84 , 1 ^) 


FROEMAPS then scales eadi probability so that the range 0-255 represents 0. to 1. FRDBMAPS out- 
put images for this exonple are shown in Figures 6-10. Brightness values in Figures 6-9 repre- 
sent probabilities of occurrence for douglas fir, ponderosa pine, white fir, and red fir, re- 
spectively, with probabilities scaled to range fnxn black (0.) to white (1.0). Figure 10, incense 
cedar, has been contrast stretched for display purposes, and presents a probability range of 0. 
to .3 from black to white. The probability irages represent maxlnun likelihood estimates of 
species proportions; they appear reasonable in li^t of the known ecological preferences of the 
species, but their accuracies ranain to be detemtbbed. 

The probability imagoes produced by PROttlAPS can be thought of as predictions of the propor- 
tion of the tinher volume expected for each species at each pixel. Ihis view inplies a ccxitinu- 
cxis mixture of species, constatly varying in response to elevation and slope aspect. An alter- 
native view is that forest stands are monospecific, and that each pixel is dominated by a parti- 
cular species or forest cover type. In sixh a case, the modeled values are probabilities that 
the pixel will be ciondnated by a partlcnilar species or stand type, and it is therefore appropri- 
ate to produce a single output image indicating the type with highest probability for each pixel. 
In this way. the logit model can serve as a nonparametric classifier. The prob^ilitles can also 
be viewed as prior probabilities, and input to MLC of an image using spectral data. This pro- 
cedure amoimts to mixing a nonparametric model for collateral data (terrain channels) with a 
parametric model for spectral data (Landsat channels) . 

3.2 LAND USE CLASSIFICATION USING TRANSITION FROBABILITIES 


An additional objective of our research was to apply the method of maxinun likelihood classi- 
fication with prior probabilities to a land use/ land cover classification (see sections 2.2.1 and 
2.2.2). In this exanple, land use/land cover maps for two 7'2-mlnute quadrangles in Ventura 
County, California, U.S.A., were obtained from photointerpretation of hif^-level U-2 aircraft 
imagery for tne years 1973 and 1976. These maps, with inherent accuracies considerably hirfier 
than those of Landsat classifications, were used as "groind truth" to construct a matrix of tran- 
sition probabilities, showing the probability of change of classification for each land use/land 
cover type to each other type in the three year interval. With a 1976 Landsat image as input data, 
we planned to carry out MLC of each pixel using the 1973 U-2 derived cover class as a collateral 
data channel indej^g the transition probabilities appropriate to the 1973 cover class. The 
resulting classification was to be conpared with the 1976 U-2 derived map to evaluate the accuracy 
of the technique. At present, we have used image overlay techniques to create the transition 
probability matrix, but the classifications using the transition probability matrix have not been 
carried out. 

3.2.1 FRDCEDURE. Using the Jet Propulsion Lab's (JH-) Image Based Information Svstem (IBIS) and 
a coordinate digitizer, it is possible to merge image data in digital form with other types of 
geographic data. The IBIS is essentially a fine-mesh grid information system which is conpatible 
with the handling and storing of digital image data. By allcxdng a user to overlay thematic map 
and digital Landsat data (or pertinCTt Landsat -derived data) with IBIS, it is possible to derive 
the values that corprlse the transition probability matrix, as well as determine the accuracy of 
the thematic Laidsat classification data. 

Prior to IBIS processing, the two "ground truth" land use/land cover maps for the two- 
quadrangle study area were prepared through photointerpretation of NASA high-altitude color infra- 
red imagery with additional ground checks. A coordinate digitizer board was used to convert the 
maps to a series of digital coordinates. Polygons of thematic land cover categories were captured 
by digitizing overlapping line segments that conprise such polygons. Polygon centroids were also 


1019 




digitized and assigned an appropriate land use/Iand cover label for later use in converting the 
polygonal data to raster (linage base) form. 

The digitized line segnent data were processed using IBIS as follows . A modified version 
of the IBIS program POLYGEN converted the coordinate digitizer se^nent data into polygons in 
the form of an IBIS graphics file. Following this reformatting, the program PCL.YREG rigidly 
rotated the polygons to conform with the Landsat data for the test site. PtXYSOW was used to 
convert the polygon data into raster form (fine-mesh grid) . The result was an image of ras- 
terized polygon borders representing the edges to theiratlc land cover vnlts. 


The next phase in the IBIS processing involved the assignnent of labels to the rasterized 
polygon areas. An intermediate categorized image was automatically produced by the progr a m 
PAIm, vdiich assif^is an arbitrary but unique brightness moiber to all the pixels within each 
rasteri:%d polygon. This output image was conbined with the polygon centroids that had also 
been converted to an IBIS graphics file format (with V2PQLY) and rigidly transformed to over- 
lay with the polygon borders (POLYREG). The progratn CTWIATCH was used to establish the corre- 
spondence between centroid labels and polygon brifditness mnbers, and the progran STRETCH was 
used to reassigji identical bri^tness values to polygons with si^lar l^ls, yielding a raster 
format Image corresponding directly with the original land use/ land cover map. 


With both land use/land co'er images in digital image format, the next step is to overlay 
them using POLYOVLY to analyze the cha^e occurring between the two dates in order to estimate 
the ti'ansition probability iraurlx. The output of POLYOVLY is a table counting the nuiiber of 
pixels in each confcination of classes across the two images. If denotes the count of pixels 
in class i of the first image, and class J of the second, then the transition probability T,- r 
for class i to J is S-' j The digitized map of 1976 land use/land cover was als6‘ 

• 


used to assess the accuracy of .^IC of Landsat data, again using, the POLYOVLY program. Accuracies 
discussed below were obtained ■'n this fashion. 


iJith a digitized 1973 land use/land cover classification map produced from air photo inter- 
pretation now in hand as collateral data channel registered to the 1977 landsat image, and with 
a transition probability matrix to provide sets of prior probabilities contingent as a col- 
lateral data channel, it should be possible to carry out maxiiiiin likelihood classification using 
the transition probabilities as prior probabilities indexed by the 1973 classification. Future 
work incluxies performing the classification, and overlaying it on die 1976 digitized map for 
accuracy analysis. Initial indications are that accuracies will Increase with the use of the 
1973 digitized map data, demonstrating the successful use of Landsat to update existing manually 
produced land use/land cover maps. 


4. CONCLUSION 

Of the large menber of statistical techniques vbich can be used to develop models coobining 
remotely sensed images with collateral data in a ccrmon predictive framework, two tedmiques 
are of special interest for remote sensing: logit modeling and maxinun likelihood classification 

with prior probabilities. These methods allow the construction of nonparametric classification 
models utilizing both image and collateral data channels as well as the mixing of paranetric and 
nonparanetric classification models for image and collateral data respectively. Both techniques 
have been successfully demonstrated in applications using Landsat imagery; each has the potential 
to greatly increase classification accuracy through the use of collateral data, and each should 
find wide application in future research and development in remote sensing. 
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Table I. Notation 


TERM DEFINniON 


y Vector of estimated dependent variables; where . signifies a vector 

and si^ifies can estimator. 

o,> Regression notation; vector of estimated betas and vector of observed 

' ■ error terms. 

I Nunber of measurement variables used to characterize each object or 

observation. 

X A ,■ -dimensional random vector. 

X. Vector of measurements on • variables associated with the ‘th object 

or observation: ;“1,2 N. 

Menber of the i-th set of classes *—1,2 . 

P(h. ) Probability that an observation will be a menber of class h,, ; prior 

probability for class . 

;.,(X.) Probability density value associated with observation vector X as 

evaluated for class *•. 

;.. Paranetric mean vector associated with the t-th class. 

y 

m. Mean vector associated with a sanple of observat iens belonging to 

the i'th class; takf>n as an estimator of u.. . 

Parametric ■ by ; dispersion (variance-covariance) matrix asscxria- 
ted with the th class. 

D, ; by .* dispersion matrix as.sociated with a sanple of ebservations 

belonging to the - th class, talien as an estimator of '.. . 
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Figure 1. Techniques for Conbinlng Continuous md Categorical Data 
(modified from Wrigley , 1979). 
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Figure 2. Klanath *;ational Forest Fi=wre 3. Landsat 'tTF band 5. Goose- Figure 4. Registered elevation imagi 

location trwp. nest test area, Klanath '.lational (dB^^^a»d frcm N.C.I.G. 1:7.SO,000 dJ- 

Forest, California, U.S.A. Scale: gital terrain data), 

pixel - 80 m X 80 m. Note: Forth is 

9° covnterclocla.T-se fror ud. 
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A LOGIT CLASSIFIER FOR MULTI-IMAGE DATA. 


Alan H. Strahler and Paul F. Maynard 


fi®narttnent of f’enoranhy 
University of California 
Santa Barbara, California 93106 


A new classifier for multi-ima'’e databases 
uses maximum likelihood estimation of narameters 
fitting a logit model to training data. A looit 
is the natural logarithm of a orobability ratio: 
In (P]_/Po). As an examole, a linear looit 
classification model for a simole two-class case 
based on four Landsat channels is: 


ln(]ir) = 80+81 (LS4)+S:(LS5)+03(LS6)+|j4(LS7) 
where Pj ind P^, are orobabil ities that the nixel 
belongs to class 7 and • resoectively, 6 q_ _a,, 
are calibration constants, and LS4...LS7 and 
the four Landsat channels for bands 4 through 7. 


Comcared with usual Bayesian maximum like- 
lihood classification, the logit classifier has 
certain distinct advantages. It is nonnarame- 
tric, in that multivariate normalitv is not as- 
sumed. The model mav be soecified in linear or 
curvilinear forms as aonrooria+e. Further, the 
model can incoroorate catenorical information in 
the form of duimiy variables, and can therefore 
be used to merge continuously measured imane 
data with categorical collaterial data in a 
single classification steo. 


Introduction 

The Bayes maximum likelihood classifier (*’LC) 
is the most coimtonly used decision rule for dis- 
crete classification with Landsat derived data. 

Such a classifier uses the Baves decision rule to 
assign pixels to the class with highest nrobabilitv, 
given the observed vector of snertral measurements 
and the orior distribution of classes. To find 
this orobability, the Bayes MLC reguires an esti- 
mate of the conditional orobabilitv of occurrence 
of the observed vector of MSS data given that it is 
associated with a specified class. This estimate 
has traditionally been obtained by assuming that 
the observed measurement vectors were Gaussian, or 
normal; therefore, the best estimate is a measure 
of the probability density value of the multivari- 
ate normal distribution for the class evaluated at 
the observation. 


With four channels 0 ^ Landsat soectral data, 
the Bayes maximum likelihood classifier has achieved 
good classification accuracies in many cases. 
Classification accuracies have been further boosted 
by the inclusion of collateral data as additional 
logical channels or as Indexes to sets o+ orior 
orobabilities in the case of catenorical collateral 


variables.* Further increases in classification 
accuracy will undoubtedly result from more ootimal 
soectral channels and imoroved technigues of in- 
coroorating collateral data. 

However, it is very likely that the soectral 
reflectances from some classes in many aocli cations 
are not normally distributed. In this situation, 
even with the ideal soectral "windows" and precise 
and relevant collateral data, the Bayes MLC will 
reach an asymptotic accuracy limit that will be 
less than ootimal. In fact, for classes that de- 
viate from normality, classification accuracies 
could be significantly less than optimal. 

Furthermore, each of the two previously men- 
tioned technigues of increasing accuracy by innut 
of collateral data to the Bayes classifier has 
critical limitations. The Bayes MLC can only 
accept measurement variables which are continuously 
distributed, and this reouires continuous measure- 
ment (or at least discrete measurement with a 
large number of discrete steos). Conseguenti v, 
the collateral channel in the direct innut aooroach 
can utilize onlv data which are measured on the 
interval or ratio scale. This reauirement elimi- 
nates many notentially useful databases, such as 
soils maps, geologic maps, oolitical boundaries, 
census tracts, etc. And when collateral informa- 
tion is incorporated through the mechanism of 
prior Probabilities, the calculation of the orior 
nrobabilities usually reguires a sophisticated 
samol ino design. 

One solution to these orobloms is to use a 
statistical technique that predicts orobabilities 
of categorical membership with no distribution 
assumptions. One such technique, called the lonit 
regression model, has been widely used in the 
social sciences for the last twenty years. The 
logit regression model generates predicted oreba- 
bilities that all sum to one for a soecified suite 
of classes, and the classification can be awarded 
to the category with the highest predicted proba- 
bility. Further, predictor variables may be con- 
tinuous or categorical; the model may be specified 
in linear or curvilinear forms; and the assumotion 
of multivariate normality is not reguired. 

This paper has the following components: 

1. a review of the mathematics of the Baves MLC; 

2 . an examination of the sources of classification 
error due to non-normal distributions; 
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3. an examination of the problems encountered with 
utilizing collateral data; 

4. development and exposition of the logit regres- 
sion model, with an emphasis on Its ability to 
act as a nonparametric classifier; 

5. a description of planned research which will 
compare the classification accuracies of the 
logit regression model and the Bayes MLC for a 
land use/land cover classification example; and 

6. presentation of an example of the use of linear 
logit model In a remote sensing application. 

The Bayes Maximum Likelihood Classifier 
Background 

In the past ten years, maximum likelihood 
classification has found wide application In the 
field of remote sensing. Based on multivariate 
normal distribution theory, the maximum likelihood 
classification algorithm has been in use for appli- 
cations in the social sciences since the late 
1940' s. Providing a probabilistic method for rec- 
ognizing similarities between Individual measure- 
ments and predefined standards, the algorithm found 
increasing use in the field of pattern recognition 
In the following decades.^ In remote sensing 
development of multi-spectral digital Images of 
land areas from aircraft or spacecraft provided the 
opportunity to use the maximum likelihood criterion 
in producing thematic classification maps of large 
areas for such purposes as land use/land cover de- 
termination and natural cultivated land inventory. ^ 

Derivation of the Bayes M IC 

In order to understand the difference between 
a classification awarded on the basis of lonlt-gen- 
erated predicted probabilities and posterior proba- 
bilities derived from the Bayes MLC, it will be 
helpful to briefly review the mathematics of the 
Bayes maximum likelihood decision rule. In the 
multivariate remote sensing application, it Is 
assumed that each observation ,r (pixel) consists of 
a set of measurements on r variables (channels). 
Through the selection of training sites, a set of 
observations which correspond to a class Is identi- 
fied -- that is, a set of similar objects charac- 
terized by a vector of means on measurement vari- 
ables and a variance-covariance matrix describing 
the interrelationships among the measurement vari- 
ables which are characteristic of the class. Al- 
though the parametric mean vector and dispersion 
matrix for the class remain unknown, they are esti- 
mated by the sample means and dispersion matrix 
associated with the object sample. 

Multivariate normal statistical theory de- 
scribes the conditional probability that an obser- 
vation X will occur, given that it belongs to a 
class as the following function: 

P(.Y|u.j;.l = (2^)*‘‘^’’ |r^r'»f>*‘‘^-^-‘'::)’T^-i(.r-ui.) 

(Please refer to Table 1 for a descriotion of the 
mathematical symbols). As applied in a maximum 
likelihood decision rule, expression (1) allows the 
calculation of the conditional probability that an 


observation is a member of each of k classes. How- 
ever, the actual probability desired is the poste- 
rior probability Plaij^lx); it can be shown** that: 

P{u^\x) - ( 2 ) 

This expression leads to the decision rule: 

Choose k which minimizes 

ln|D^|+(Y-m;;,)'0''(Y-m,^)-21nP{a,^). (3) 

In usual practice the prior probabilities P(t*ij^} are 
assumed equal, or P{id, )=!//? where H is the number o^ 
classes. In this case, the last term in expression 
(3) is constant over all P classes, and need not be 
considered in the decision rule. This equal priors 
decision rule is used in the currently distributed 
versions of LARSYS and VICAR, two image processing 
systems authored respectively by the Laboratory for 
Applications of Remote Sensing at Purdue University 
and the Jet Propulsion Laboratory of California 
Institute of Technology at Pasadena. 

Classification Accuracy and the Assumption of Nor - 
mal U'y 

In a typical supervised classification, train- 
ing sites are selected by the analyst to typify 
each class. Histograms of spectral values for 
classes are inspected for multivariate normality, 
and when a class actually consists of several dis- 
tinctive signatures, training sites are reaggre- 
gated into subclasses, each of which is aporoxima- 
tely multivariate normal. In this way, a set of 
multivariate normal disnersion patterns are defined 
for the desired classes. It is imoortant to real- 
ize that such dispersions, because they are se- 
lected to be as "pure" as oossible, are probably 
underdispersed with respect to the true information 
class. This effect produces a difficulty in the 
classification of mixed pixels. Since the MLC 
model does not provide for mixed pixels, the impli- 
cit assumption is that mixed pixels are to be clas- 
sified according to the most probable signature 
match; the components of the signature which are 
reflected from the less important classes contained 
within the pixel are thus regarded as random noise. 
The mixed pixel, then, is typically classified by 
comparing probability densities within the tails of 
overlapping multivariate normal distributions. The 
accurate classification of mixed oixels under MLC 
thus requires a good fit of the tails to multivari- 
ate normality; however, it is obvious that the 
selection of training sites for purity will of 
necessity produce a poor fit in the tails. And, 
mixed pixels will constitute a large oortion of the 
scene -- up to forty oercent in some agricultural 
applications (F, Hall, personal communications). 

This reasoning naturally leads to the consi- 
deration of nonparametric classifiers. Brooner et 
al.” compared the Bayes MLC to a classifier which 
used P(r|u)jjl as directly estimated by a sampling 
procedure, and reported a four oercent increase in 
classification accuracy over MLC. However, direct 
estimates of P(.T |ui.l reouire more data as well as 
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dealing with r-dimensional table storage problem. 
The logit model, discussed in the following pages, 
provides an alternative which should require fewer 
data to calibrate and can, by virtue of its curvi- 
linear modeling, approximate the real distribution 
without assuming multivariate normality. 

Problems with Incorporating Prior Probabilities 

As shown earlier, the Bayes MLC can easily be 
modified to take into account prior probabilities 
which describe how likely a class is to occur in 
the population of observations as a whole. The 
prior probability itself is simply an estimate of 
the proportion of pixels which will fall into a 
particular class. These prior probabilities are 
sometimes termed weights, since the modified clas- 
sification rule will tend to weight more heavily 
those classes with higher prior probabilities. 
Strahler* showed via simplified numerical examples 
how these different weights can affect the decision 
of the Bayes MLC. As the prior probability P{u^) 
in expression (3) becomes large and approaches I, 
its logarithm will go to zero and the classifica- 
tion decision will effectively be made with ex- 
pression (1). 

However, since this possibility and all 
others must sum to one, the prior probabilities of 
the remaining classes will be small numbers, thus 
increasing the value of the expression. Since the 
classifications is ararded to the class with the 
smallest value, the effect will be to force classi- 
fication into the class with high probability. 
Therefore, the more extreme are the values of the 
prior probabilities, the less important are the 
actual observation values X. 

Strahler* has demonstrated how prior proba- 
bilities can be used as a mechanism to incorporate 
collateral data in categorical from into the Bayes 
MLC. His mechanism uses a set of prior probabili- 
ties estimated for each collateral category by an 
external sampling procedure, with the classifica- 
tion algorithm accessing the appropriate set of 
probabilities contingent upon the collateral cate- 
gory of the pixel. In this fashion, categorical 
collateral information is merged with multivariate 
normal information concerning the spectral signa- 
tures. Although this approach was proven effective 
for a forestry appl ication,' estimations of the sets 
of prior probabilities may require considerable 
data collection, depending on the number of classes 
and collateral categories. In addition, multivari- 
ate normality of signatures is assumed, and the 
comments of preceding paragraphs apply. In con- 
trast, use of the logit classification model allows 
categorical and continuous information to be mixed 
freely through the mechanism of dummy variables. 
And, again, the model can be fitted in a linear or 
curvilinear fashion as desired, avoiding the 
assumption of multivariate normality. Thus, the 
logit model offers a more natural, straightforward 
way of incorporating categorical variables into 
the classification procedure. 


The Logit Regression Model 

Lin ear Modeling of Probabilities 

The most conmonly used predictive multivariate 
statistical technique is probably ordinary least 
squares (OLS) regression. The prediction, or esti- 
mated value of the dependent variable, is a func- 
tion of the vector of estimated betas (g) in combi- 
nation with the vector of observed independent 
variables (.v). The betas are estimated in such a 
way that the variance about the least squares re- 
gression line is minimized. 

When used to model probabilities, OLS regres- 
sion has one major drawback: although orobabili- 

ties are constrained to lie within the range of 0 
to 1, the predictions generated from such a model 
are unbounded and may take values from minus infin- 
ity to plus infinity. Thus, the predictions may 
lie outside the meaningful range of probability. 
Further, the probability of each class must be 
modeled separately, and there is no constraint to 
ensure that all probabilities must sum to one. 

One solution to the bounding problem is to 
specify that 

0 < P. < 1 

— t — 

(where P. is the probability of observing a speci- 
fied class or category of the dependent variable). 
In the case of ordinary least squares regression 
model , 

= Bo + fli-Tn + 62X^2 . 

The simplest way to satisfy this condition is to 
impose the following arbitrary definition of P^. : 

1. P.. is eoual to 0 if Y. is less than 0; 

2. “..is equal to Y. if V. is equal to or between 

0 and 1. ^ ^ 

3. is eoual to 1 Y. is greater than i; 

and use straightforward ordinary least squares 
estimation of the regression parameters. This 
solution is often referred to as the linear proba- 
bility model ■ Unfortunately, although it appears 
to be a simple solution to the predicted probabili- 
ties problem, the model has a number of serious 
limitatiq,ns which are discussed by Domencich and 
McFadden. Again, the probabilities are not con- 
strained to sum to one. 

The Logit Model 

The simplest, yet most statistically sound, 
solution to the probability problem (within a re- 
gression framework) is the logit transformation. 

In this transformation, the ratio between the pro- 
bability that an observation or pixel ’ belongs to 
a class Pj and the probability that it does not 
belong to Pj is expressed as a logistic function: 


20 


ORIGINAL PAGu !S 
OF POOR QUALITY 


t 


r 


where t* Is fl vector of parameters and \ Is a veitor 
of observations on indeoendent variables. Talking 
the natural logarithm of this expression, 

) • b,T • (5) 

i’. 1 

The left-hand quantity is referred to as a logit. 
Note that when tcT Is zero, the ratio will be 1, 
indicating equal probability. As pJt varies oosi- 
tivelv or negatively, the ratio will shift accord- 
ingly. 

The ratio, uiuier the constraint that the numer- 
ator and denominator must sum to one, determines 
the two probabilities uniquely. Ixpression (4) 
can be solved explicitly for P, : 

n . • «•) 


And, if P- is defined as l-Pj. it is easv to show 
that 


P. 


1-P. . 

1 , • 


1 


(71 


Although expressions (4) and (bl show the 
product pt as a linear function, the \ vector mav 
contain powt'rs and cross products in the case of a 
curvilinear nx'del. An example is (eleim'ittal nota- 
tion!: 


requires more calibration data than MLC because of 
the larger number of parameters which need to be 
estimated. 

Lo£it Example 

The following is an example that uses contin- 
uous and categorical explanatory variables to esti- 
mate 

P. 

In -a— ) ■ do ^ (q) 

where P.. ./(1-P. j) is the ratio of the probabilitv 
that pix^l t' is hot of class 1. T.j refers to a con- 
tinuously measured variable on pikel •' (MS^ or con- 
tinuous collateral data), and D, is a dummy variable 
th.»t is equal to one if a categorical variable is 
true at pixel ' and zero if it is not. Given the 
logit ratio, simple algebra will extract the value 
of P,. . It is straightforward to add more con- 

tiniiPtiS and categorical explanatory variables. 

Istincsting the .Re£r«sion Parameters 

In order to calculate th« logit ratio in the 
preceding formula, it is necessary to obtain esti- 
itvUes of the regression narameters. The first step 
is to specify Ihe model in terms of a likelihood 
function. If the training observations are thought 
ot as independent trials, then the likelihood of 
the outcome of these trials (for the two-class case 
described above) is; 


P. 

ln( )-dj ,r, 

1-P. 


( 8 ) 


for the bivariate case. 

Unlike the conventional regression models, the 
logistic and linear logit reqi'ess ion model s require 
either a weighted least squares (WIS) procedure or 
a maximum likelihood procedure to estinvite the 
calibration parameters (betas). The choice between 
the two mt'thods deiH'iuis upon whether or not the 
samtile under investigation includes repeated ob- 
servations for each combination of values of the 
explanatory variables. If so. WIS is appropriate; 
however, renxite sensing appl icat ’ ons rarely have 
repeated observations. Consequent 1v. the method 
of nxvximum likelihood is the preferred method. 

A number of authors present the details of this 
method, which is discussed briefly in a following 
sec t i on 

M.iximum likelihood estimation of loqit model 
parameters has many other attractive features. 
Provided that the sample data are not multicol- 
1 inear, a unique maximum likelihood estimator can 
be obtained even in relatively small samples. 

Also, the nvithematical properties of th«* likelihood 
function allow for efficient computer programs to 
produce the ;\iramt«ter estimates. These estimates 
are consistent and are the best possible estimates 
in very large samples. The disadvantages of the 
procedure are that it involves numerical optimi- 
zation and therefore more computation, and that it 
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n 


(i-^.i) 


( 10 ) 


where observations ;’»1 to nj are those in which the 
observed dependent variable was a memN’r of class 1 
and !‘»ni ♦ 1 to n are the observations in which the 
dependent variable was not a member of class 1. 
Substituting from the definitions of P,* j and l-P,- j 
in expressions (61 and (7), the result (s 
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( 11 ) 


As specified, the lik^'lihood depends upon a 
set of unknown parameters, the betas. These para- 
meters are estimated by choosing those values which 
maximize the preceding likelihood formula for the 
given set of training data. Rather than maximize 
the likelihood itself, it is computationally sim- 
pler to maximize the logarithm of the likelihood, or 

In L - r M- 1n(Ur?M. (12) 

I ■ 1 ■ 1 

To maximize this expression it is not possible to 
set the partial derivatives of In (L) with respect 
to the betas to zero, and solve simultaneously for 
the betas in a direct fashion. Instead, the solu- 
tion must be obtained by iteratively recalculating 
In L for successive estimates of betas until the 
partial .lerivatives converge upon zero. 
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There are several mathematica’ techniques for 
iteratively converging the first partial deriva- 
tives to zero. One well-known technique Is the 
Newton-Raphson Method, which calculates deltas 
(amount of change) for the betas by forming the 
matrix of second partial derivatives, inverting It, 
and postmultl plying It by the vector of the first 
partial derivatives. That is, 

d = E‘*C, 

where J Is the vector of calculated deltas, t"* Is 
the Inverted second partial derivative matrix, and 
(j is the vector of first partial derivatives. The 
deltas are subtracted from the betas, and the sec- 
ond partial derivative matrix and first partial 
derivatives are recalculated with the new betas. 

The process continues until the vector of first 
partial derivatives has converoed upon zero, at 
which point the most likely vector of betas has 
been identified. 


Given these maximum likelihood estimates of 
the betas, the last step in a logit model classifi- 
cation sequence is to use exoressions (6) and (7) 
to calculate the vector of probabilities for each 
pixel and award the classification to the category 
with the largest predicted probability. 


Polychotomous Logit Regression 


The preceding example, although conceptually 
straightforward, is not applicable when there are 
more than two categories to be predicted. The 
dichotomous logit can be easily extended to the 
polychotomous logit. Now, instead of two catego- 
ries of interest, there are •? possible categories. 
The model now becomes: 


P. 

t,r> 


.M: 


P 

r w 
••=1 




(13) 


where P,. , is the probability that pixel belongs 
to the >’tH category. 


Because of the constraint that the probabili 
ties must sum to one, only .'’-1 sets of betas and 
probability ratios need to be determined. Intro- 
ducing this constraint on the .“^th class, it is 
easy to show algebraically that 


P. .. = \-l P. 

e=i I , 




(14) 


This constraint can also be introduced by taking 
g = 0, which produces an identical exoression from 
substitution into expression (13). 


for estimates of the betas, the maximum like- 
lihood estimation procedure, described above for 
the two-class case, is generalized to the .<?-class 
case in a str.aiqhtforward manner. 


P roportion Estimati on 

Although the discussion above has stressed the 
use of the logit model for classification, it may 
also be used for proportion estimation. Nelepka 


Klamath National Forttt Test Site Location 



Figure 1. Index map showing location of area 
modeled in Klamath National Forest. 

et al . and Woodcock et al. have both discussed 
this problem using underlying assumptions of multi- 
variate normality. The looit model provides an 
alternative which does not assume multivariate nor- 
mality and estimates proportions directly. As in 
classification, either linear or curvilinear models 
may be selected, and caoeqorical variables may be 
readily utilized as well. The difference between 
application of the logit model as a classifier and 
as a proportion estimator lies in the nature of the 
calibration data. For the classifier, training 
observations of .r vectors (for pixel ?‘) are each 
individually labeled with a single class; in the 
case of proportion estimation, each observation 
contains the observed proportions of classes and 
the associated measurement vectors. These propor- 
tions constitute weights, and it is easy to show 
that the likelihood function becomes (as in the two- 
class easel- 


In (L) 

t - I T - 1 


r B,.r. +S-.Y. , 

: ln(l + ’ ), 

:‘ = 1 


where .J. is the proportion for the »th observa- 
tion for’class i-, and u is the observation weight 
(which, as a constant, is eliminated in differen- 
tiation of the fraction). 
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Application Example 

At the present time, the logit-based classi- 
fier has not been tested, although a logit model 
has been used In a forestry remote sensing problem 
of proportion estimation. This use Is summarized 
In the paragraphs below. For this application, a 
linear logit model was devised and fitted to forest 
species compositional data for northern California, 
predicting the proportion of timber volume for each 
of five coniferous tree species at each pixel based 
on registered terrain data quantifying elevation, 
slope, and aspect. This research utilizes the 
Video Image Coimunlcatlon and Retrieval (VICAR) 
system and the Image Based Information System 
UBIS) resident at the University of California, 
Santa Barbara. VICAR/IBIS, developed at the Jet 
Propulsion Laboratory (JPL) at Pasadena, California, 
Is a job control language which permits the sequen- 
tial linking and execution of a vast array of 
Fortran and Assembler routines In a batch environ- 
ment. In addition to extensive usage of existing 
VICAR/IBIS routines, new VICAR and non-VICAR soft- 
ware were developed and/or modified as required for 
this application. 

Logit modeling of species proportions used 
data derived from the Klamath National Forest, lo- 
cated in northern California (Figure 1). Ranging 
In relief from 500 to 8,000 feet, the Forest in- 
cludes 2,600 square miles of rugged terrain In the 
Siskiyou, Scott Bar, and Salmon Mountains. Little 
of the area is developed beyond management for 
timber yield, livestock production, and recreation. 

A wide variety of distinctive vegetative types Is 
present in the area. Forest vegetation Includes 
such coniferous species as noble, red, white, and 
douglas firs, ponderosa pine, and incense cedar, 
as well as several oaks, and typical species of 
chaparral. Thus, the topographic and vegetatlonal 
characteristics of the area are well differentiated. 
Within the Klamatt National Forest, a study area 
Including most of the Goosenest Range was selected 
for logit modeling of species comoosition from 
terrain features. This area was chosen because 
calibration data and Landsat Images were readily 
available for it. 

Digital Terrain Mode l 

The logit model for this forestry application 
requires preparation of digital terrain data. 

These data, obtained from the National Cartogra- 
phic Information Center, in Reston, Virginia, are 
derived from processing of 1:250,000 contour maps, 
and include elevations at every point on a grid of 
approximately 65 m spacing. Although the data are 
comparable in scale to a Landsat image, the eleva- 
tion values are quite generalized because they are 
produced from small scale contour maps by interpo- 
lation. 

Slope angle and slope aspect channels can be 
produced using the elevation data of the regis- 
tered terrain image. Although a number of slope 
and aspect generating algorithms are known, the 
simplest is the fitting of a least squares plane 
through each pixel and its four nearest neighbors 
and the calculation of the downslone angle and 


direction of the plane. Slope aspect was trans- 
formed from a coding of zero to 255 representing 0® 
to 359®to a cosine function shifted by 45®. This 
function, proposed by Hartung and Lloyd,'^ contrasts 
northeast-facing slopes, which present a favorable 
cool, moist growing environment, with hot, dry 
southwest- facing slopes. Although the function is 
defined ecologically. It also simulates Lambertian 
reflectance from a light source placed in the north- 
east, and thus the aspect Image shown In Figure 2 
gives the strong visual Impression of relief. 

Logit Model 

The logit model fitted Is 

In(^) = 6i,;^ + 62 ,iJ ♦ Bj,^^ + Bm,^, k . 1.5, 
k 

where P. is the probability that a board-foot of 
timber Volume will be drawn from one of five soecles 
k, P. is the probability that the board-foot will 
not Be drawn from species P Is elevation (com- 
pressed to 0-255 range), A is aspect transformed as 
described above, S Is slope angle, and ..., 64 ,;^ 
are the estimated regression constants. llo{e that 
five equations, one for each species, actually com- 
prise the model. The model was calibrated using 73 
measurements of timber volume prepared by the U. S. 
Forest Service and located within two subregions of 
the Goosenest range. These samples are probably not 
representative of the entire area modeled, but serve 
for the demonstration purposes or this research. 

Each sample was located on 1:15,840 scale color air 
photos and transferred to Band 5 of a registered 
Landsat image to obtain the line and sample coordi- 
nates of the sample point. The coordinates were 
then used to extract elevation and aspect values for 
the sample from the registered elevation and aspect 
images. The coefficients for the model were fitted 
by a nonlinear optimization algorithm employing the 
Newton-Raphson method described above. 

Given the constants produced by this procedure, 
the probability images were created using the new 
VICAR program "PROBMAPS." PROBMAPS, written speci- 
fically for this aoolication, calculates the pro- 
bability of species k for each pixel i using the 
following expression: 


PROBMAPS then scales each probability so that the 
range 0=255 represents 0 to 1. PROBMAPS output 
images for this example are shown in Figure 2. 
Brightness values represent that probabilities occur- 
rence for douglas fir, ponderosa nine, white fir, 
and red fir, with probabilities scales to range 
from black (o.) to white (1.0). The incense cedar 
image has been contrast stretched for display pur- 
poses, and presents a probability range of 0. to .3 
from black to white. The probability images repre- 
sent maximum likelihood estimates of species propor- 
tions; they appear reasonable in light of the known 
ecological preferences oF the species, but their 
accuracies remain to be determined. 
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Figure 2 . Clockwise from upper left: cosine function of slooe aspect; probability Images of coniferous 
forest species red fir; white fir; Incense cedar; ponderosa nine; and douglas fir; for Goosenest test area 
within Klamath National Forest. For probability Imaoes, only area within Forest boundary Is shown. 
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Future Work 

Although the logit classifier appears to have 
some unique advantages over conventional maximum 
likelihood classification; further work will be 
necessary to prove Its value for remote sensing 
applications. Topics to be Investigated include: 

1. Model Specification . For what shapes of non- 
normal distribution are linear models appro- 
priate? Under what conditions are curvilinear 
models necessary? Could a stepwise procedure, 
analogous to polynomial curve fitting, be 
devised for model calibration? Since 1t Is 
possible to obtain assymptotic estimates of 
the standard error of each beta, could the 
stepwise procedure drop Individual terms from 
the model which are not significantly differ- 
ent from zero? 

2. Accuracy . How well are probabilities pre- 
dlcted? Can a confidence limit be placed on 
the predicted probability? Monte Carlo meth- 
ods may be helpful here. How does accuracy 
Interact with distribution shape? 

3. Further Appl 'cations . The logit model needs 
to be exercised on a real classification pro- 
blem and compared with conventional MLC. 

Which Is more accurate? Which consumes more 
computational resources? Do categorical vari- 
ables prssent any special problems? 

These questions and others will be the subject 
of future research In the application of the logit 
model to the remote sensinn problem. 

Table 1. Notation 

Term Definition 


p Number of measurement variables used to 

characterize each object or observation. 

X A p dimensional random vector. 

X. Vector of measurements on p variables 

associated with the tth object or obser- 
vation; t = l ,2,. .. ,n. 

PiX.) Probability that a dimensional random 

vector will take on observed values X^. 

u. Member of the *cth set of classes u; *c*l, 

2 K. 

Pluij,) Probability that an observation will be 

a member of class w, ; prior probability 
of class uj,. 

P{T|u.) Probability density value associated 

with an observation vector X as evalu- 
ated for class 

Probability density value times prior 
probability for observation vector X 
evaluated for class 

u Parametric mean vector associated with 

the l:th class. 

Parametric p by p dispersion (varlance- 
cova^lance) matrix associated with the 
kth class. 


Term Definition 


Dk P by p dispersion matrix associated with 

a sample of observations belonging to the 
kth class; taken as an estimator of 

1 Sumnatlon sign, add together all occur- 

t-i rences of i from 1 to n. 

Product sign, multiply together all occur- 
rences of f from 1 to n. 

§ Estimated vector of regression parameters. 

Mean vector associated with a sample of 
observations belonging to the kth class; 
taken as an estimator of “j,. 
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The loi^it 'jliaaifLer, i computit i onil ly efficient non- 
pinmet r i c pattern cliislfier, <?ive3 higher cliaai ficition 
accuricies thin the Biyea cliaalfier when the diti ire not 
•nul 1 1 vir I ite normal. Biyealin posterior probibl 1 1 ties ire 
il (febnl cil ly obtiinel from minimum llkellhool pinmetera 
fitted to 1 lo(^irlthmic ratio of preiictel probabilities. A 
simple two-cli33 example is: 

P, 

In w- = 6.x 
?. 

where P^^P 2 Is tie ob.iervel oHs ratio of cla.ss ui| over 

class u- ml the Is the vector of calibration const into, 
estimatel by iteriTively maximizing the likellhooi function 
of the molel (7iven the observ"! iata in the vecto" of spec- 
tra! '‘hansels X. The estimatel posterior probabilities of 
class membershTp are al?,ebraical ly l*rivel as 


In a Monte Carlo test on .simulate I nonnormal lata, the 
lo<?it classifier w is significantly jup.-rlor to Bayes, with 
th-* a'‘curacy in'‘rea 3 ei up to VI < in one test. In a test 
with Lanlsat M3J lata, the logit classifier igaln gave sig- 
nificant increa.ses In accura'y, some as high as 5^^*. 

1 . THE BAYS.! CLA.!3IF1SR 

■"ne most commonly usel liscriminant function in remote .sen.sing application.s 
.s the Bayes maximum likelihool classifier Bayes MLC . It .‘alcilites posterior 
probabilities of class membership base! upon parametric estimation of “rcbibll- 
itv lensity fun'tlons for the lifferent ‘Inses anl awarls the cl ass i f i cat ; n to 
■he ‘ategorv with th“ greatest or most "likely" probability. The Bayes ML! li 
Bayes opt. mi!, i.*., given a se-o-one loss fjnction, tne ;ost.5 of m 1 sclass . f i vi- 
*ion are minimizel if tne following two ‘onl .'loni are ne* ; i ‘.j.. probability 
lenii'y fin-’icns ar> multiva" i*e norm ai , anl 2 tn-* T”-i!r pr jbab 1 1 i ' i "S of 
•■a ‘h ‘iasi a-e 'orre'tly spe'.flel. 


Dre !...)• .) • w,.. PI i, : n* •■'n it i on il .’ivmnosiim sn Rem*,. I'ofijing of En- 

V. ronmen’. Ann Arbor, M!. M iv.'hBI. 
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1.1 IH0B:.KM3 rflTH BORDER PIXEL.? 

In tho tr-ilitionil supervised c lass I f I cnt i on 'ippronch, tr-ilnin<? sites nre 
seleotel to be ns pure ns possible so ns to minimize the presence of pixels from 
extrnneous clnsses. This reduces the Intrnclnss virlnnce, nnd becnuae It pro- 
motes the formntlon of tlf?htly defined point swnrms In mensurement space, It 
reduces clns.n overlap nnd confusion. However, In the real world, classes are 
unlikely to be uniform. Even In Inriite n^r I cul turnl nrons, the same crop can 
have a different .apect''al reflectance depending? upon the underlyln* soil *^ype or 
plant moisture content, etc. 

The use of pure training, sites li^nores the spe-tral Information In a sl(?nl- 
fl'^ant number of pixels that can be termed border or mixed pixels which should 
be Included In the Information class. These pixels occur In the tails of the 
distributions, and because they are sot sampled In the tralnlnit sites, the pat- 
tern for the class Is under dispersed with respect to the true Information class. 
This effect produces a difficulty In the classification of mixed pixels. Since 
th" Bayes YLC does not provide for mixed pixels, the implicit assumption is that 
mixed pixels are to be classified accordln;^ to the most probable sli^natire 
mat''h; th" components of the slt^nature which are refle^tel from the less impor- 
tant classes contained within th® plx®! ire thus treated as random noise. The 
mixed pixel Is *vpl:ally "•lasslfied by ;omparlnit probability densities within 
the tails of overlappin<t miltivarlat® normal distributions. 

The • lass i f i ' iM on problem in the slmpl' two--laas univariate situation Is 
solved by csmparinAt th® val les of th® probability density functions. The clas- 
sifi 'a*ion is iwarde! to th® '•lass with the i^reate.st val le or density) at each 
X. The simple two-;lass univariate 1 i s 'r im i nat i on problem with equal prior pro- 
babilities and equal variances is shown in Fi;?ure 1. The Bayes MLC discriminant 
fun'tlon for univariate data defines th® critical decision point is the Inter- 
section point of th® two 'ur/ej. ,’t at i s 1 1 ' I ins wil' recoi?nlze the area to the 
left of th® Intersection point anl under the curve of ui as the area of Type II 
classifl •a*ion error for th® 'lass oi, on the left ind as~ the area of Type I 
error for th® ‘lass on th'.* ri^ht.‘ 

If th® vtriin’e for 'lass u is un Iwrest im it®d anl toe tru- density func- 
tion Is ‘tiV'n by ‘he ®xpanl®l • irv® u in Ki.tJre 2. tn® irea of lndj''l classif- 
l ca* i on <'rror is th-' liff®r®n'e in ar®a b®.w>*n th® tri® anl fie un ler 1 1 sper j» 1 
'irv'S. ?n-> • r 1® Titi'a! 1®''.sion point li ohlft®! to th® left; in tnis exam- 
ple an obs --/-l X ••• ‘a* *r than 'l l bit l®si • h in d* that woild nave be-cn mis clis- 
,i.fi®i inti 'lisi li. lo now •orr-j'tlv •lassifl"! into class • 

1.’ EFESJT IF •; HXAL A PPH IX I'd A" 1 )1{ D*l DATA 

A* this poin*, it be'om-’ - 1 ‘va-it to 'onsld®r th® bias that Is lntrolu'®i 
j <• probabili’y l®nsi*v fin'tlon 1'^ X 'U. ) for 'lass i is not multlvarlite nor- 
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3^yes HLC is "robust" regardini? minor infractions of normality cl^se to the 
mean or mean vector if the Mahalanobis distance 6“^ is fairly large ( 6“ > 4; see 
Zhezhel 1963.) However, as the class mean for m. moves closer to the class mean 
of uj , the probability of error exponentially increases and there are a greater 
numbeF of potential errors because the densest parts of tne distributions are 
being overlapped. 

The special case of Bayes MLC with equal prior probabilities and equal 
covariance matrices, or Fisher's Linear Discriminant Function (LDF), has been 
extensively investigated (see Gupta, 1973). Lachenbruch et al. (1973) and Zhe- 
zhel (1963) have studied the performance of Fisher's LDF on nonnormal data. 
Their results indicated that Fisher's LDF was greatly affected by nonnormality. 
Zhezhel (1963) showed t^at the maximum error, rate is a decreasing function of 
the hahalanobis distance 6' and tends to 0 as 6‘->oo. Thus accuracy for nonnor- 
mal classes whose S'" value is small will be the most severely affected. 

1.3 SUITABILITY OF bnYES 3LC FOR REMOTELY SENSED DATA 

Several authors (Fu et al., 1969: Brooner et al., 1971; Crane et al., 19^2) 
have noted that remotely sensed data were not strictly multivariate normal. 
Most studies of the distributional nature of remotely sensed data have been with 
agriciltural data. Given that most large sample agricultural remotely sensed 
data are nonnormal (the Crane et al . (1972) study showed nonnormality for fields 
averaging over 2,000 data points), it is safe to assume that smaller samples 
would show even greater departures from normality through the inclusion of ran- 
dom noise. The probability of error for Bayes MLC is higher in two separate 
situations: incorrectly specified variance and nonnormal distributions. And 
obviously, the combination of the above two situations will produce even worse 
results. Since the need to establish pure training site statistics can lead to 
problems of underestimation of the variance, and in most renote sensing applica- 
tions the assumption of strict multivariate normality is not valid, Bayes MLC 
with parametric estimation of the probability density functions is not an 
optimal classifier for many remote sensing applications. 

2. THE LOGIT CLASSIFIER 

Most well known predictive statistical techniques are derived from least 
squares regression. Unfortunately, the least sq'.'ares technique assumes that the 
dependent variable is continuously measured. This produces two problems with 
data chat are nominally classified. It can be shown (Domencich and McFadden, 
1975) that the predicted values of the response variable in such a model are 
best inte'^preted as predicted probab i ' ' ties . By definition, probabilities are 
constrained to lie between zero and one. However, the predictions of the least 
squares model are unbounded and may range from minus infinity to plus infinity. 
Consequently , the predictions may lie outside the meaningful range of probaoil- 
ity and may be inconsistent with their probabilistic nature. The other problem 
is related to he ter osced ast ic i t y and will be dealt with later. 

Although there are several statistically acceptable techniques for classi- 
fying nominal data such as the probit, logit, and arotan, the logit model has 
computational advantages since it is a closed (explicit) functional form with 
convonlent curvature properties for numerical optimization. 

2.1 THE LOGISTIC TRANSFORMATI )N 

The term "logit" was coined by Berkson '' ••■i) on analogy of Bliss' use 
(1934) of the term probit, or probability unit, and is a contraction of the 
phra^" "logistic unit." Basel on the logistic curve (see Dom»ncich and McFadden, 
1975) the dichotomous logistic response law can be defined (Book, 19(0) by 
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§,Xi 

e + e ' 


(1) 


where P.. is the logistically derived posterior probability that pixel i belongs 
to class ujj . These posterior probabilities are obtained by taking the logistic 
tr ansfor-nation , 


P P . 

In p“- = > 

i2 ^ '^il - ^ 

“stinating the betas, and substituting these values back into Eq . (1). The 

logistic tr ansfor-nation enables the value of the predicted dependent variable to 
range from minus infinity to plus infinity while the values of the predicted 
probabilities range from zero to one. 

It is relatively straightforward to show that in the mu.-.nomial case, 
i.e., more than two classes, the binomial logit generalizes to 


P. 


ik 


e, X. 

.“-k-i 


K jB X. 

T ^S-l 

2 exp 
s= 1 


k=1 ,2 X 


( 2 ) 


where P.j^ is the posterior probability that pixel i belongs to the kth class, 
given that there .are K classes. 


The logit model is distribution independent. Proof of this property nas 
been given by Day and Xerridge (1967); they show that the logit model includes 
an unknown non-negative scalar function of the observed data. This unknown 
function gives great robustness to the logit model. The only cost is a loss of 
efficiency should the function be known to belong to a specific distribution, 
e.g. multivariate normality. In such a case, Bayes ’dLC will give greater accu- 
racy because it maximizes an unconstrained likelihood function, as compared to 
the logit which is constrained such that the class probabilities sum to 1 and 
the estimated class proportions equal the observed class proportions. 

•dAXIM'J'1 LIKELIHOOD ESTIM/VTION OF P^R.ilHETERS 


There are two commonly used statistical techniques for estimating regres- 
sion parameters: ordinary least squares (OLo) and maximum likelihood. Least 
squares is computationally simpler and the most frequently used. However, even 
though taking the logistic tr ansformation has solved the proble-n of inter pret in g 
the predicted dependent variables as probabilities, there remain problems with 
heterosce d ast ic i ty because the observed probabilities are based upon a nominally 
measured dependent variable. Thus, 0L3 estimated parametcs are not 31 'JE (Best 
Lin°ar Unbiased Est i ■ itor s) . Consequently, maximum likelihood estimation is the 
preferred method. 


The fi-st step is the formul ation of a 1 ikel ihoo d function: 

N n. ! K r, 


A . n --r- 

irril" 


n p 


i:<'k=1 


ik 
i k 


wher-^* N is the total number of observations or pixels, the P.^^ are defined in 
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?’q . , the n, 'iro the totnl nuuber of pixels per observation (nlwsya 1 in the 

sepnr-ite sn-npie reniote sensin*^ nontext', nnl r.. is equnl to 1 if pixel i 
beion<?s to cliss k; 0 otherwise. After droppln^^ the ^ nons t-in t terns nnJ tikln ,5 
the natural lof^nr I thm , the lo<? likelihood function is ^?lven by 


A- 


N 
2 

i=1k=t 


K 

r . , In P. , 

1 k i k 


The second step is iinxlmizin*^ the log likelihood function, oevernl compu- 
titionnl nlgorithns exist for such n purpose; the Newton-Rnphson technique whs 
chosen for this study (see McFndden, It iterntively converges the vector 
of first pHrtlHl derivntlves of the log likelihood function to 0, which by 
definition is h nnximum of the function. Deltns or chnnge Increments -ire cnlcu- 
Inted by inverting the untrix of seconi purtinl derivitlves (the Informstion 
■TiHtrix' nnd postmulti plving this times the vector of first p-irtinla. The deltns 
H’^e ndded to the betns nnl the first pirtinls nni inform-ition mstrix nre recnl- 
'ulntel. If th-* into ire not mul t i -oil i n*’-ir , convergence of the first purtinl 
ier i vnt ivet: to 0 with ' l'»cimHl plnces of •iccuri-'y occurs in less thnn 10 Iteri- 
tions. 


Muximura likelihoo'i estiuntion of the conlitionil logit moie^ cun be shown 
under very gener-il conditions to provide estlmstors thit nre HsymptotiCHlly 
efficient -inl normnlly distributed. These results cnn be used to construct 
HjproximHte Inrge simple confidenc** bounds for the pirimeters . The Inrge simple 
( isymptotid viriin'es -in d stnndird errors nre obtnined from the dligonil terms 
of the inverted m-itrix of second pirti-il der i vit i ves . 

d. MONTE CARLO TESTS ON SIMULATED DATA 

The two models of concern -ire the Biyes MLC nnd the linenr logit clnssif- 
ier. The following Monte C-irlo exper iment-il design wis selected. Three mujor 
i-itn groups or effects -ire gener-ited; (!' univHri-ite normil. (2' slightly 3k**wei 
univnri ite, ml ,d' severely skewed un i vir i ite . Eich group defined i two-wiy 
effect between the M-ih-ilinob i s list-ince between 'itegories -ini tne stinlird 
levi’ition within eich '-iV-gory. For inilyt i c-il simpli'itv. -irbi t rir i ly selected 
Muh-il-inobis distinees of 1.’ ind In were 'ressel with virinnces of 1,9. 'ind 1o. 

Three d x d univiri it » dit i distributions were rei^uired. A’l were desired 
to be univiri'ite normil initinlly, with -ipproprinte ilgebrii-r m in i pul it i ons 
•ilded H p ost er ior i to produce the differences in the menn ml stHni-ird devintion 
ml tKe skew * ~ef f.?c ts . Severnl stHtisti'-il softwire p-ick-iges generit*' r-iniom 
norm'll deviites; the I nte rn it ionil 'I ithemut i ""il nnd .It itisti cil Libriry ,IMSL, 
ri'’0l rfns 'hosen beriuse of its eisy i • cess ib i 1 i t v mi cle'ir document-ition. 

jiven rindom uniform devintes distribut-^i with -i r.-’ro mem ind unit viri- 
nnce, it is e-isy to obt'iin 'iny desire! norm-il distribution by multiplying by the 
desire! st-ind-ir! levi-ition ml aiding the ieuir-’i m'>m. Th'> slightly ik'»w"! 
! i.c t r i but i ons were produce! by aiding th'> vilue of on-^ stiniiri ievi-ition ti 

'll I tn-> obS'*rvit i ons that were l'>ss th in 1 5 from th > mem. The sever'ily 3k'»w * i 
distributions W'»r-> produ'>’i in i oimil ir fashion. If th'» vil )■’ of X was b'»low 
two stindiri deviitions from the m-^m , then wis lidei t) it: if the vil was 
between minus on* to two stmiari i'>viitions below the mein then dj was idled to 
it. -i 1 d*» i to it. 

Ei’h m'lior effe't normal, slightly skewed, -inl severely sk-*wil was geti- 
erited by ••rossing the thre'» s-^l-^'t'*! Mihalanobis distin'es with the tnree 
S'>le't-'d vir i -m '•'‘s . Consequently, ei-v, mi, dor effect was '■imposed of nine dif- 
ferent distributions or 'Inss’s. Th'> skewed m'l.jor eff.s-r.s w're 'r'ltei py 
•ippl.ying the previmsly i ’S'"ib 'i op'-ati ons to only thos'^ ■! iss *s w.th 
Mgniim''bis ,diittn'-'s '’qn! n;n>. ?h i o skewi only * h-> ‘hr’'' ’Iissm w . t 

“c - > md ^ ini lefi th'> 'I'lss’c n''rmil. 
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The results of the test with Lsndsat MSS data are shown in Tables 4 and 5 
(training and test sites are separated). Two facts are easily discerned. (I) 
In all cells except for Bayes' l^-non the classification accuracy is unusually 
high. (2) The accuracy of t’.ie logit was always equal or superior to Bayes. The 
first fact is explained by the accurate location and specification of the train- 
ing sites. Many Landsat classification efforts do not have access to accurate 
"ground truth" maps, and this can introduce random error variance. Also, this 
classification was based on four easily separable classes. The second fact is 
explained by the shape of the training and teat site histograms (please refer to 
Figures 7 and 3). By comparison with the slightly nonnormal histograms of the 
simulated data, many of these histograms are truly skewed and significantly non- 
normal. The two most skewed test classes, urban and lemon, are the classes in 
which Bayes MLC is significantly inferior to the logit. The greatest increase 
in classification accuracy secured in the lemon test sites, an increase of 

5, CONCLUSION 

The logit classifier has been shown to be both theoretically and experimen- 
tally superior to the Bayes MLC with the simulated data anl tne Landsat M33 data 
used in this study. And, as theoretically predicted, the logit classifier was 
inferior to Bayes MLC with normal data when 6'^ was large. 

In order to confidently generalize the above conclusions, further tests are 
needed on simulated normal data with different transformations and on Landsat 
MSS data of greater complexity. Siven the results produced by this preliminary 
study, the logit classifier warrants further investigation as a nonparametr i c 
alternative to the Bayes MLC. 


Another condition besides nonnormality can give rise to biased results for 
both the logit and Bayes classifiers. This condition is known as a violation of 
the Independence of Irrelevant Alternatives (IIA) axiom (MePadden, 1975). 
Briefly, this axiom states: 

the ratio of the probabilities of predicting one alternative over anoth- 
er (where both alternatives have a non-zero probability of occurence^ is 
unaffected by the presence or absence of any alditionai alternatives in 
the choice set (Hensher and Johnson, 1931). 

The existence of the IIA property introduces the ma.jor problem that failure to 
ensure that all the alternatives are equally distinct may lead to biased preiic- 
tions. Both the logit and Bayes classification models violate this assumption. 
Rather than being equally distinct, most classes sought from remotely sensed 
data are hierarchically distinct, as Swain and iauska ' 1 oco , observed. As 
a result, these investigators have developed a Bayesian hierarchical classifier. 
Likewise, similar work needs to be done with the logit classifier. 


Further work on the logit cla.ssifier and its applications with remote sens- 
ing should include: 

( 1 ‘ develooing a "nested" algorithm to handle violations of the IIA axiom; 

developing the quadratic logit: 

(■^'i developing capabilities for using prior probabilities; 

1 converting key nortions of the code to Assembler language, and testing 
the computational costs of the logit and Bayes classifiers; 

S' testing the logit classifier with a wi ie range of applications. 
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The results of the test with Landsat MSS data are shown in Tables 4 and 5 
(trainin/; and test sites are separated). Two facts are easily discerned. (1) 
in all cells except for Bayes' l“iion the classification accuracy is unusually 
high. (2) The accuracy of t'.ie logit was always equal or superior to Bayes. The 
first fact is explained by the accurate location and specification of the train- 
ing sites. Many Landsat classification efforts do not have access to accurate 
"ground truth" maps, and this can introduce random error variance. Also, this 
classification was based on four easily separable classes. The second fact is 
explained by the shape of the training and test site histograms 'please refer to 
Figures 7 and 9). By comparison with the slightly nonnormal histograms of the 
simulated data, many of these histograms are truly skewed and significantly non- 
normal. The two most skewed test classes, urban and lemon, are the classes in 
which Bayes MLC is significantly inferior to the logit. The greatest increase 
in classification accuracy secured in the lemon test sites, an increase of ■59'^- 

5. CONCLUSIDN 

The logit classifier has been shown to be both theoretically and experimen- 
tally superior to the Bayes MLC with the simulated data ani tne Lanlsat M33 data 
used in this study. And, as theoretically prelicted, the logit classifier was 
inferior to Bayes MLC with normal data when 6^ was large. 

In orler to confidently generalize the above conclusions, further tests are 
needed on simulated normal data with different tran.sformations and on Landsat 
M3S data of greater complexity. Civen the results produced by this preliminary 
study, the logit classifier warrants further investigation as a nonparametr ic 
alternative to the Bayes MLC. 

Another condition besides nonnormality can give rise to biased results for 
both the logit and Beyes classifiers. This condition is known as a violation of 
the Independence of Irrelevant Alternatives (IIA) axiom (MePadden, 197J). 
Briefly, this axiom states: 

the ratio of the probabilities of predicting one alternative over anoth- 
er (where both alternatives have a non-zero probability of occurence’' is 
unaffected by the presence or absence of any additional alternatives in 
the choice set (Hensher and Johnson, 1991). 

The existence of the IIA property introduces the major problem that failure to 
ensure that all the alternatives are equally distinct may lead to biased predic- 
tions. Both the logit and Bayes classification models violate this assumption. 
Rather than being equally distinct, most classes sought from remotely sensed 
data are hierarchically distinct, as Swain and iauska 'I9a7'i observed. As 
a result, these investigators have developed a Bayesian hierarchical classifier. 
Likewise, similar work needs to be done with the logit classifier. 
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Figure 5- The nine normal rl i stribut i ons 
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n 

1 2 
1 a^=4 

I 0^=9 i! <j2=i6 

i 

Logit 1 Bayes 

Logit Sayes ITLogit 1 Bayes 

a- o> ^ 

1 .55 1 .55 

; .39 1 .39 

1 .77 1 .77 

[ . 56 1 .56 ' ' . 56 i .55 

; .23 ! .28 j i .40 ! .37 

1- 77 1 . 77 1 1 .70 1 . 76 

Table 1. Comparison with normal data 
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Table 2. CoTipariaon with slightly skewed data 
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//LOGSB JCB (101 1«LI,1«2) .ll, CLASSED 
//POBTBAN PBOC 

//* FOBTBAN EXTENOKO ENHANCED TAS 

//FOBT PXEC PGn=IFEAAB»BSGIONs600K» 

// PABn«*OPTINlZE(3) « NAEE (nAIN<*4) • 

//STPPLIB DO DSN=SYSLB1.FOBTHQ-LOADLIB,DISP=SHB 
//STSPBINT CO SYSOOT=A 
//SYSPONCH DD SYSOUT=B 
//SYSTEBR DD SYSOUT=A 
//SYSUT1 DD UNIT=SYSDA,SPACE= (TRK, (10,10) ), DCB=BLKSIZB=3465 

//SYS0T2 DD ONIT=SYSDA,SPACE=(TEK, (10,10) ) ,DCB=BLKSIZE=5048 

//SYSLIN DO DSN=eeLOAOSET,UNlT=SYSDA,DlSP= (HOD, PASS) , 

// SPACE=(CYL, (2, 1) ) ,DCB=BLKSIZE=3200 

// PFND 

//ASSEHBLE PBOC 

//ASH FXEC PGH=ASHGASn,BEGlON=136K, PAPH=*LOAD, NODECK* 

//STEPLIE DD 0ISP=SUB,DSN=SYSLB1.ASNG. V2L7A.LOADLIB 

//SYSLIB DD CISP=SHB,DSN=SYS1.HACLIB 

// DD DISP=SHB,DSN=UCSB.HACLIB 

//SYSUT1 DD DSN=66SYSUT1,SPACE= (1700, (400,50) ) , 

// UNIT= (SYSDA,SEP=SYSLIB) 

//SYSUT2 DD DSN=&6SYSUT2,SPACE= (1700, (400,50) ) ,UNIT=SYSDA 

//SYSUT3 DD DSN=66SYSUT3,SPACE= (1700, (400,50) ) , 

// UNIT= (SYSDA,SEP= (SYSOTl ,SYSLIB) ) 

//SYSPEINT DD SYSOUI=A 

//SYSPONCH DD SYSCUT=R 

//SYSGO DD DSN=6GLOADSET,DISP= (MOD, PASS) ,ONIT=SYSDA, 

// SPACE= (80, (200,50) ) 

// PEND 

//LNKEDT PRCC PBEC=SINGLE 

//» PLOTTING ADDED 12-80 TAS 

//LKED EXEC PGH = IPHL , PABfl= (H AP , LET , LIST) ,COND= (12,LT) ,BEGION=1 10K 

//SYSLIB DD DISP=SHP,DSN=NYL.BYS927.STRAHLEF.IPL1.SDSLIB 

// CD DISP=SHB, DSN=S¥S1, PLOTLOAD. CALCOHP 

// DD DISP=SHB,DSN^SYS1. PLOTLOAD. OCSB 

// DD DISP=SHfi,DSN=SYSLEl.FOBTUC. FOBTLIE 

// DD DISP=SHB, DSN=SYSLB1.P0FTHEXT.E2L3.F0BTL1B 

// DD D1SP=SHB,DSN=UCSE. FOHTLIB 

// DD DDNAME=ePPEC 

//DOUBLE DD DI SP=SHB, DS N=S YSL B 1 . I MS L, LO ADL I B- DP8 

//SINGLE DD DISP=SHB, DSN=SYSLB1. IHSL. LOADLIB.SP8 

//SYSLHOD DD DISP=SHP, nSN=NYL. BYS9 27. SIEAHLEB. CL ' >SLIB 

//SYSPBINT DD SYSOUT=A 

//SYSOTl DD ONlT=SYSDA,SPACE= (CYL, (5, 1) ) 

//SYSLIN DD DSN=S6LOADSET,DISP= (OLD, DELETE) 

// DD DDNAHE=SYSIN 

// PEND 

//LOGSOB EXEC FORTRAN 

C LOGIT, A VICAR PPOGPAH THAT USES AN ITERATIVE MAXIHUM LIKELIHOOD 
C SOLUTION TO ESTIMATE LOGIT REGRESSION PARAMETERS, THAT CAN BE INPUT 
C TO THE VICAR PHOGPAH PEOBCLAS TO CLASSIFY A DIGITIZED IMAGE. 

C 

C LOGIT, WEITTEN BY PAUL MAYNARD, GRSO, UCSB, SANTA BARBARA, CA. 

C CURRENT VERSION: APRIL, 1981 

C 

C LOGIT CAN HANDLE OP TO 20 CLASSES, WITH A MAXIMUM OF 30 BETAS- 
C THE SECOND JOB RECUIBES 300K REGION. 


COnnON /sv 0AT& (600,20) ,TDAT& (600,20) ,SDATA (600,20) 

COHHON /S2/ VAR(20,50) «in (100) •SUBT(40,20) ,BETA(100) 
connoN /s3/ delta(Ioo) ,pdbr( 100) •icLAssceoo) 

INTEGER PNT,eS,EL,SL«SS,SSl,HDEQ 
1NTEGER*4 PARR (200) ,QPAR (24) ,NCLASS(20) 

INTEGBR*4 KHD(24) /• ITER •,* KCHO* ,• APPR* CBAN* ,• OBAN »,* NCLA • , 
6«TRAI* EOOA* , • HEIG« PRIO* , • SQOD* ,* CONV* , 12 •• •/ 

INTEGEP*4 ClASS/'CLAS*/ • VERT/* VERT V 
BEAL TOTAL(600),CONVBG(4) 

BEAL PRIOR (20) ,PPABH (200) ,THPVCT (100) 

LOGICAL BYTE 
LOGICAL«1 IBUP(7200) 

EQUIVALENCE (I BYTE, BYTE) , (PARR ( 1) , BPAFH ( 1) ) 

RAIN PROG DIRENSIONING — IF CHANGED, CHANGE THESE NURBEBS 
NRD>600 
NC0«20 
NRE-100 

GET PABARETERS, AND SET THE DEFAULTS 
CALI PAPAN (IND, PARR, 200) 

NBOHI=PARR (5) 

NSAHP=PABR (6) ORIGINAL PAGE 13 

KPAR=:PARH (10) OF POOR QUALITV 

CALL KSCAN (PARR, NPAR,KHD,QPAE, 65000) xi-M-m. 

NITER=6 

IF (QPAR (1 ) .EQ.1) NITER-PARH(KHD(1) «2) 

NCOL-0 

IF (QPAR (4) .EQ. 1) NBAN0=PARR (KWD (4) »2) 

ND = 0 

IF (QPAT (5).EQ.1) ND = PAR N (KW D ( 5) ^2) 

NC0LD^Nbf.N0>ND 

NS=NSARP/NCOLD 

NCCL=NBAND»1 

NCCLD=NCOLD»1 

IONE=1 

NC=NCOL 

NCLS = PARR (KUD (6) ^2) 

NCIS1= NCLS-1 
PNT = KND(7) *2 
IOFF=0 
NBCU=0 

DO 2 K=1,NCLS 

2 PRIOR (K) =1.0 
RDEQ=1 

IF (KUD(9) .NE.O) HDEQ=0 
IF (KMD(IO).EQ.O) GO TO 6 
DO 3 K=1,NCLS 

3 PRIOB(K)=RPABR(KHD(10) ♦K^l) 

6 CONTINUE 

CONVRG (1) =1.0 
CONVBG (2)=0.90 
CONVRG (3)=100. 

CONVBG (4)==1.0 

IF (KUD (12) .EQ.O) GO TO 9 

NCGNV=CPAB (12) 

DO 8 I=1,NCONV 

6 CONVBG(I)=FPARH(KUD(12) ♦!♦!) 

9 CONTINUE 


BEAD IN IRAGE, ONE LINE AT A TINE 


14 


15 


16 


SCAN IHBOUSH ENTIRE PABAHETER FIELD, AND PLACE DN*S IN 
DATA IF THEY ARE A *HIT* 

CALL OPEN(NLBL, 2, 0,0, 0,32768) 

DO 50 IROH>1,NBONI 
IPNT«PNT-1 
IRBC^IBOHfNLBL 

CALL BEAO(IND,2,IRBC,0,IOFF, NS ANP, IB0F,0) 

CALL CHECK(IND,2) 

IF (IND.BQ.4) GO TO 5010 ORIGINAL 

IPNT=IPHTt1 - ZJwCd nnts ITY 

IF (IPNT-GE. NPAB) GO TO 50 OF. POOR QUA 

IF (PABH (IPNT) .NE.CLASS) GO TO 15 
IF (PABN(IPNT«2).Eg.VEBT) GO TO 40 
IPNT*IPNT^3 
ICLAS=PABH(IPNT-1) 

CONTINUE 

SL<PABH(1PNT) 

EL»PABH(IPNTt2) «>SL-1 

IF (IROH.GE.SL.OII.IBOH.LE.EL) SO TO 16 

IPNT=IPNT^3 

GO TO 14 

CONTINUE 

DO 22 I==SL,EL 

IF (IBOU.NE.I) GO TO 22 
SS^PABN (IPNT^I) 

ES=PARB (IPNT^3) ♦SS-I 
DO 21 J=SS,BS 
NROU=NBOH»1 
ICLASS(NBOH)=ICLAS 
SS1=J-NS 
DATA (NPOU, 1) =1. 

DO 20 K-2,NCOLD 
SS1=SS1*NS 


BTTE=IBUF (SSI) 

DATA (NROH,K) =FLOAT (I BYTE) 

20 CONTINUE 

21 CONTINUE 

22 CONTINUE 
IPNI=IPNT»3 
GO TO 14 

40 CONTINUE 

50 CONTINUE 
IAFPBX=0 

IF (KUC(11) .EQ.O) GO TO 113 
NADD= (NCOL* (NCOL-1) ) /2 

IF ( ( (NCOLD^NADD) *NCLS) .LE.NFb.AND.KUU (3) .EQ.O) GO TO 63 
WRITE (6,53) 

53 POPBAT (/2X, *TOO LARGE DIBENSION FOE COBPLETE QUADRATIC BODEL, 
1 APPROXIBA1E QUADBATIC MODEL IS USED') 

DO 55 I=1,NCLS 
DO 55 J*1,NADD 

55 VAE (I,J) =0. 

DO 59 IB=1,NROH 
IPNT*0 
K-ICLASS (IR) 

CO 59 I=2,NCOL 
DO 59 J=I,NCOL 
IPNT=IPNT+ 1 

59 VAR (K, IPNT) = V AB ( K , I PNT) *0 AT A (I B, I) *0 AT A ( IP, J) 

IAPfBX=1 


NCGL«NCOL« 1 
MCCLD-*NC0LD»1 
IP (NJ.EQ.O) GO TO 113 
DO 61 IBOHs1,NPOV 

61 DATA(IPOH,NCOLO) «DA1 A (IFOH « NCOL) 

GO TO 113 

63 IIBHCOL«NCOL^NADD 
IP (ND.EQ.0) GO TO 67 
DO 64 lBOi«1,MROB 
DO 64 I«1,ND 

IDOn-I^NCOL 

REUDUH«MEHCOL«I 

64 DATA(IBOH,NEHOUn) -DATA (IEOH, IDUn) 

67 DO 103 IRON<1,MBOU 

MCOL1«NCOL 

DO 103 I«2,MCOL 
DO 103 J>I,NCOL 
HC0L1«NC0LU1 

103 DATA(IBOU,NCOL1) =DAT A (IBOH , I) • DATA (I BOH, J) 
ICOL«NEHCOL 
MCOLD=NSHCOL^ND 
113 NB«MBOH 

IP (NBOH.GT.20) NF»20 
IP (ND.EQ.0) GO TO 625 
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TBAMSFOBH DUfini VARIABLES INTO 1*S AND 0*S 
NCOLI^NCOL^I 
K2 = 0 

DO 510 I>1,MEOW 

DO 510 J-NC0L1,NBB 
TDATA (I,J) ^0. 

510 CONTINUE 

DO 600 J»NCOL1,NCOLD 
DO 520 I=1,NROH 

520 TOTAL(I) <UATA (1, J) 

CALL HAXniN (NFOU,TOTAL,XniN, XnAX,FAUGE) 
NFANGE=IPIX (PANGEA. 1) 

IP ( (NRANGE^NCCL).GT. NRB) GO TO 605 
NVAL^NBANGE^I 
DO 580 I-1,NBOH 
NIIR=IPIX (TOTAL (I) ) 

IF (NUn.EQ.O) NUn=NVAL 
IP (NUa. '^Q.NVAL) GO TO 5BO 
GOTO (531 ,532,533,534,535) ,NUH 

531 ITCOL=MCOL^1 
TDATA(I,ITCOL)*1. 

GO TO 580 

532 ITCOL*NCOL»2 
TUATA (I,ITCOL) =1. 

GO TO 500 

533 I1C0L=NC0L^3 
TDATA(I,ITCOL) =1. 

GO TO 580 

534 ITCOL*HCOL^4 
TDATA(I,ITCOL) *1. 

GO TO 580 

535 ITCOL»NCOL^5 
TDATA (I,ITCOL) =1 

580 CONTINUE 

NCOI=NCOl»NPANGE 


600 

CONTINUE 
GO TO 610 



605 

CONTINUE 

IJ«J-1 

NBITB(6,606) IJ 



606 

POBHAT (• OINSUFPICIENT 

BETA DINEHSIONIHG 

TO CONVEBT ALL DUNHIES',/ 


6* ONLI PIBST *,I2,* 

ABE USED*) 


610 

CONTINUE 
DO 620 I*1,NBON 




DO 620 J*NCOL1,NCOL 


OF POOR QO^;, , 

620 

DATA (I,J) ^TDATA (I, 
CONTINUE 

J) 

625 

CONTINUE 
DO 819 K*1,NCLS 



819 

NCLASS (K) >0 




C CALCULATE TOTAL PIXEL NUHBBBS IN EACH CLASS 


DO 829 I>1,N£ON 
A^ICLASS (1) 

829 NCLASS (K) «NCLASS (K) 

DO 839 K>1,NCLS 

If (NCLASS(K) .LE.O) UBITE (6,843) K 
839 COMTINOE 

843 FOFHAT (//* WABNING: CLASS*, 13,* HAS NO TBAINING P1/:EL,* 

S/ • TOU ABE GOING TO HAVE TBOUBLE*//) 

NB-NCOL*NCLS 

IP (NB.LE. MBB) GO TO 890 

CALL QPBINT (•OINSUFPICIENT BETA OINKNSIONI NG* , 31) 

GC 10 6000 
890 CONTINUE 

DO 900 I<1,NB 
XN(I)<0. 

900 EETA(I)-0. 

NB1-NC0L*NCLS1 
DO 903 Is1,NCLS 

IPT= (1-1) *NCOL^1 
903 XB (IPI) = 1- 

00 906 IB0U«1,hP0U 

IPT-* (ICLASS (IPOH) - 1) *NC0L^1 

DC 906 1=2,NC0L 

IPT*IPT^1 

906 XH (IPT)>XB (IPT) ♦DATA(1B0H,1) 

DO 913 I«1,NCLS 
IPT= (I-1)*NC0L»1 
DO 913 J«2,NC0L 
IPT*IPT»1 

913 XH (IPT)=XH(IPT)/NCIASS(I) 

IP (lAPPBX.BQ.O) 30 TO 947 
DO 916 K>1,NCLS 
KPT= (K-1 ) •NCOL^I 
IVPT*0 

LO 916 J = 1,N'.AND 
JPT=KPT*J 
DO 916 L>J,NBAND 
IPT«KPT^I 
IVPT=IVPT*1 

916 VAN (K, IVPT) *VAB (K,IVPT) /NCLASS (K) -XB (LPT) *XH (JPT) 

DO 921 K^1,NCLS 

DO 917 I=1,NADD 

917 THPVCT (l)aVAB (K,I) 

CALL CONVBT (THPVCT, N ADD, Nd AND, BCD) 


910 

921 


923 

929 


9930 


931 


939 


941 
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DO 910 I=1,NliDD 
VAF(K,I)=TnPVCT(I)/100. 

CONTINUE 
DO 929 K=1,NCLS 
DO 923 I^UKADO 
VAB(K,I)>74E(K,I)-VAF(NCLS«I) 

CONTINUE 

DO 99930 K-1,NCLS 
KPT=K*NCOL 
in (KPT)=0. 

DO 939 IB^IfNBOH 
DO 939 K=1,NCLS 
SDATA(IB,K) «0. 

IVPT=0 

DO 931 J=2,NC 
DO 931 I=J,NC 
IVPT=IVPT^1 

SDATA(1R,K)=S0ATI(IR,K) ♦DATA(IR,J) •DATA (IB, I) •VAB(K,1VPT) 
IF (K.ME.ICLASS(IR) ) GO TO 939 
DATA (IB, NCOL)=SOATA(Ifi,K) 

KNCL-K«NCOL 

IN(KNCL) -XN(KNCL) •S0ATA(IB,K) 

CONTINUE 
DO 941 K>1,NCLS1 
KNCL-K*NCOL 

in (KNCL) -in (KNCL) /NCI ASS (K) 


ECHO BACK INPUT DATA 
947 NB1=NE 

IF (KUD(2) .NE.O) NE1=NB0V 

IF (QPAP (2) .EQ. 1) NRI^PABn (KWO (2) ^2) 

IF (ND.EQ.O) GO TO 750 
HRITE(6,700) NCOL 

700 FOBnAT(* lINPUT DATA, HITH CONTINUOUS CHANNELS IN FIRST *,12, • COLU 
&nN GROUPS*, /«' FOLLOHBO BY TBANSFOFNED CATEGORICAL DATA*) 

DO 710 1=1, NR1 

710 NBITE(6,715) ICLASS (I) , (DATA (I,J) , J=1 , NCOL) 

715 FOBHAT(* CLASS* ,13, 2X,20F6-0) 

GO TO 800 

750 CONTINUE 

BPITE (6,770) 

770 FOPnAT (* 1INPUT DATA*) 
no 700 1=1, NR1 

780 NBITE(6,715) ICLASS (I) , (DAT A (I, J) , J= 1 , NCOL) 

IF (NP 1. EQ.NBOWt GO 10 800 
IBITB (6,1361) 

1361 F0En.\T(* THE LAST OBSERVATION IS:*) 

HRITE(6,715) ICLASS (NBOW) , (DAT A ( NROH , J) ,J=1,NC0L) 

80C LCNTINUE 

WRITE (6,1362) NPOU 

1362 FOPnAT(* TOTAL NUnBER OF OBSERVATIONS =*,I4) 

CALCULATE PROPORTION TOTALS 
DO 810 K=1,NCLS1 
CO E09 J=l,NCOL 
sum (J,K) =0. 

DC 605 I=1,NROW 

TK = 0. 

IF (ICLASS (I) .RQ.K) TK=1. 

SUBT (J,K) = SUBT (J,K) ♦DATA (I,J) ♦TK 


n n n on non 


60S CCNTIMUE 

809 CONTINUE 

810 CONTINUE 
C 

C CALCULATE DEGREES OP PREEDOH AND NBOUS OP OUTPUT PROPORTIONS 
NOF=NBOU*NCLS1-NCOL*NCLS1 


START OF NAIH LOOP 


950 

953 


904 

905 

911 


HISPRE=2*NROi 
DO 1000 n«1, NITER 
HRITE(6,950) n 
PORHAT (• OITBRATION • ,12) 

CALL LOGIT (NROU, NCOL, NCLS, fllSPRE, 
GNRC,NCO,NRB,IER,CONVBG,IAPPRX, NADD,HIS) 

IF (lEB.EQ. 129) GO TO 1060 
IP (IEB.NE.34) GO TO 911 
URITE (6,905) 

PORHATC INVERSE ACCURACY SUCCESSFULLY INPBOVED*) 

GO TO 1075 

CONTINUE 
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OUTPUT ITERATION VALUES 
DO 980 K=1,NB1 

BETA (K) = BETA (K) »0ELTA (K) 

IP (KUD(2) .NE.O) HRITE(6,975) BETA (K) , FDEE (K) , DELTA (K) 

75 POBHAT(* BETA •,P10.4,« FDEB •,F12.4,» DELTA *,2F10.4) 
80 CONTINUE 
000 CONTINUE 
niSPRE=0 

CALL LOGIT (NBOU , NCOL , NCLS, HISP BE, 

6 NEC, NCD, NPB,IBB,CONVRG,IAPPBX, NADD,niS) 


GO TO 1075 

1060 WRITE (6,1061) 

1061 FOEHATC* IKPORflATION liAlBIX SINGULAR',/, 
BIS=100 

1059 DO 1058 1=1, NB1 

1058 BE7A(I)=BETA(I)-DELTA(I) 

1075 CONTINUE 


ITERATION HALTED') 


WRITE CUT RESULTS 
WRITE (6,1100) 

, 1100 EOPBAT {• 1FINAL ESTIMATION OF PAEANETEBS • ,/, ' TUB FIRST SUBSCRIPT 

SFOLLOWING THE BETA BBFEPS TO THE CLASS NUNBEB',/,' THE SECOND SOB 
eSCEIPT EEPEBS TO THE PARABETEP NUHDEE') 

KNT = 0 

DO 1200 K=1,i:CLS1 

IP (PRIOR(NCLS)-EQ.I.) GO TO 1129 
PEIOP (K) =PRIOF (K) /PRIOR (NCLS) 

1129 ADJ=NCLASS (NCLS) *PPIOB (K)/NCLAbJ (K) 

AOJ=NDEO*ALOG(ADJ) 

K2=0 

DO 1200 J=1,.JCOL 
K2=K2»1 
KNT=KNT^1 

IF (lAPPRX.EU.O.OB. J. NE. NCOL) GOTO 1133 
DC 1131 I=1,NADD 

VAB(K,I)=VAP(K,I)*EETA(KNT) 

TEHP=VAR (K,I) •DELTA (KNT) 


1131 


JPI*J»I-1 

HBITE (6,1150) K,JPI,VAB (K,I) ,TEnP,ZN(KMT) ,BBTA(KNT) 

GO TO 1200 

1133 IF (J.EQ.1) BETA (KNT)^BETA (KNT) ♦ADJ 

HEITE(6, 1150) K,K2,BETA (KMT) ,DELI A (KMT) , XH (KMT) 

1150 POENATC BETA(*,I2,»,*#I2,*)=*,F12,6, 

S • DBLTA=« ,F9.5,* INPOT BBiN=*,Pl0.3,F9.5) 

1200 CONTINUE 

HFITE(6, 1500) BIS 

1500 POBBAT(*OPBEDICTIED PROBABILITIES OF BISCL ASS-PIXELS *, 15) 

IF (BIS.6E.80) NBOH=80 
DO 2000 I=1,MBOH 

IF (TOATA(I,ICLASS(I) ) .GT.0.5) GO TO 2000 
SBITE(6,1600) I,ICLASS (I) , (TDATA(I,J2) ,J2=1,NCLS) 

1600 FOBBAT(2I4,20F6.2) 

2000 CONTINUE 
2000 COMTINOE 
CALL END 

5000 CALL QPBINT ('OPABABETEB EBB0P*,16) 

GO TO 6000 ORIGINAL PAG" 

5010 CALL CPBIMT (• OONEXPECTED EOF*, 14) OF POOR QUALITY 

6000 CALL QPRINT(* EXECUTION TE6B1N ATFD* , 21) 

CALL ABEND 

BETUBN 

END 

SOBPOUTINE COMVBT (X,B,M,NP) 

BEAL X(B) ,5(20,20) , SIN Y (20,20) , IK (1000) 

IDGT=1 

BPT=0 

DO 100 1=1, N 
DC 100 J=I,N 
BPT=BPT*1 
S (I, J) =X (BPT) 

100 S(J,I) =S (I,J) 

CALI LINV2F (S,N,Nfi,SINV,IDGT,UK,IEP) 

BPT=0 

DO 200 1=1, N 
DO 200 J=I,N 
BPT=BPT»1 

200 X (BPT) =SINV (I, J) ♦ (2-I/J) 

PETUPN 

END 

SUEBOUTINR LOGIT (NEON, NCOL, NCLS, HISPPF, 
6NPC,NCD,NBB,IFB,CONVBG,IAPPBX, NADD,B1S) 

COBBON /SI/ DATA (600,20) ,TDATA (600,20) ,SDATA (600,20) 

COBBQN /S2/ VAP(20,50) ,XR(100) ,SUBT(40,20) ,BETA(100) 

COBBON /S3/ DELTA (100) ,FDEB (100) ,ICLAbS (600) 

PEAL FDNFW (100) ,DLTNEU (100) ,CONVBG(4) 

IF DIBENSIONING IS CHANGED IN B/PBOG, CHANGE THESE ALSO 
PEAL TOUT(20) ,Q(40,20) ,UT(20,20) ,KBON(40,40) 

BEAL SS (100,100) ,VKABEA (12000) 

INTEGEP NPNT(IOO) 

NCLS1=NCLS-1 
NB1=NC0L*NCLS1 
NCKC=NCOL-IAPPPX 
NB = NBUNCOL 
IONE=1 

DO 5 l=1,NCOL 
DO 5 J=1 ,NCLS 


5 Q(I,J)=0. 

DO 25 Js1,NB 
DO 25 J2-1,NB 
SS(J,J2)-0. 

25 COMTIMDE 
C 

C CBEATE LOGIT QUOTIENT 
APPONE=1.-0. IE-30 
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DO 500 11*1«HB0H 
SUH-0. 

TEHP=1. 

HODO?F=0 

DO 100 K==1vNCLSl 
OUT=0. 

K1=K-1 
NP2-K1*NC0L 
DC 90 J=1,NC0L 
DSUn^DATA (II, J) 

NP2=NP2^1 

IF(IAPPBX.BQ. 1.AND.J.EQ. NCOL) DSUH=SDATA (II, K) 
80 OaT=OUT*DSUR*BBTA (NP2) 

90 CONTINUE 

91 OOT=OUT*NODOVF 

If (OUT. GT. 170.) OUT=170. 

IF (OUT.LT.-170.) OUT=-170. 

TCJr(K)=EXP(00T) 

IF ( (7.0L75-S0M)-GE.TO0T(K)) GO TO 96 
93 HODO?F=MODOVF-100 

SUn=S0n/2.688lE43 
TEHP=TT;hP/2.6881E43 
GO TO 91 

96 SUfl=SUH^TOOT(K) 

100 CONTINUE 

TOOT (NCLS) =TEHP 
SUP=SON»TFNP 
SBALL=SON*0. IE-30 
DO 150 K=1,NCLS 

IF (TOUT (K) .LT. SHALL) TOUT (K) =SHALL 
TDATA (II ,K) =TOUT (K)/S0H 

IF (TDATA(I1,K) .GT.APPONE) TDATA (I 1 , K) =APPONE 
DO 148 J=1,NCOL 

0SUn=DATA (II, J) 

IF (I APPBZ.EQ. 1. AND. J.EQ. NCOL) DSUH^S DATA (1 1 , K) 
Q (J»K) =Q (J»K) ♦TDATA (I1,K) ♦DSUH 
14« CONTINUE 
150 CONTINUE 


C 

C CBEATF HEIGHT HATBIX (THE INFOBHATION HATPIX) 

DO 200 K=1,NCLS1 
CO 200 K2=1,NCLS1 

HT (K,K2) =-TDATA(I1,K)*TDATA(I1,K2) 

IF (K.EQ.K2) HT(K,K2)=TDATA(I1,K)*(1.-TDATA(I1,K)) 
200 CONTINUE 


C 

C CBEATE KBONECKEB HATBIX (THE OTHEB COHPONFNT OF INFO HATBIX) 
DO 250 I=1,NCKC 
DO 250 J=1,NCKC 

KFON (I,J) =DATA (11,1) ♦DATA (11,J) 

250 CONTINUE 


C 


CBEATE *NBCU* LAYFBS OF INFOBHATION NATBIX 
NBC=-NCOL 

DO 3U0 NBH-1,NCLS1 
MBg-tiBQ*NCOL 
HCQ=-NCOL 

DO 300 MCH-1,IICLS1 
NCQ=NCQ»NCOL 
NB=NBQ 

DO 280 NBK-1«MC0L 
MB=NB»1 
NC=NCQ 

DO 280 NCK=1,NCOL 
MC=NC^1 

IF (IAPPBX.ME.1) GO TO 270 
IP (M6K.BE.MCOL.AND.NCK.NE.NCOL) GO TO 270 
IF (NBK-NCK) 261,263,265 

261 TEHP=UT (NBU,NCU) *DATA ( 1 1 , NBK) *SD AT A (1 1 , NCH) 

GO TO 280 

263 TBHP=UT(NBU,NCU) *SDATA (II, NEB) *SDATA (I1,NCU) 

GO TO 280 

265 TEHP=HI(NBU,NCU) *SDATA (I1,NBH) «DAT A (II , NCK) 

GO TO 280 

270 TEHP=HI (NBH,NCV) *KBON (NBK,NCK) 

?80 SS (NB,NC) =SS (Ua,NC) 

)00 CONTINUE 

>00 CONTINUE 

GOODNESS OF FIT TEST 
HIS=0 

DO 11990 I=1,NBOH 

IF (TDATA(I,ICLASS(I)). LB-0.5) HIS=MIS>1 
1990 CONTINUE 

«BITE(6, 11995) BIS 

1995 FOPBAT(» BISCLASSIFIED PIXEL BEFORE ITEB ATION= * , 14) 
IF (BIS.GE.niSPBE) GO TO 91056 
niSPBE=MIS*CONVBG (2) 

GO TO 91000 
1056 IEE=34 

GO TO 2000 
1000 CONTINUE 

CALCULATF THE FIRST DERIVATIVE 
K2 = 0 

DO 651 K=1,NCLS1 
DO 650 J=1,NCOL 
K2 = K2^ 1 

PDEB (K2)=SUET(J,K) -Q (J,K) 
i50 CONTINUE 
651 CONTINUE 
NBB1=NB1-1 
DO 900 I=1,NBH1 
IPl=It1 
697 B=I 

CO 700 J=IP1,NB1 

IF (ABS (SS (N,I) ) -LT. ABS (SS (J,I) ) ) H=J 
700 CONTINUE 

IF(H-EQ-I) GO TO 707 
DO 705 J=I,NB1 
TEMP = SS (I, J) 



705 


SS (I,J)^SS (H, J) 
SS (H,J) =T1»MP 
TEnP = FDRB (I) 

FDEE (I) -FDER (H) 
FDEE (H)=TENP 
707 CONTINUE 
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H=I 

CO 709 J=1P1,NB1 

IF (ABS(SS(1«H)).LT.ABS(SS(1«J)) ) H-J 
709 CONTINUE 

IF (ABS(SS(I«I)).LE. ABS(SS(1«R)) *CONVRG(i) ) GO TO 800 
IF (ABS(SS(I,I)) .LE.CQNVRG(I)) GO TO 760 
IF (ABS(SS(I,I)).LE.ABS(FDEB(I)«CONVPG(4))) GO TO 770 
FDER (I) =FDEF (I) /SS (1,1) 

SDEl=SS(I,I) 

DO 720 J=I,NB1 
720 SS(1,J)=SS(I,J)/SDEL 
DO 750 J=IP1,NB1 
SDEL=SS (J,I) 

DO 740 K=I,NB1 

740 SS(J,K)=SS(J,K)-SS(I,K)«SDEL 

750 FDEF (J)=FDEP(J)-FDER(I)*SDEL 
NPNT (I) = 1 
GO TO 900 

760 WRITE (6,761) I 

761 F0BHAT(I4,» SSII TOO SHALL*) 

GO TO 800 

770 HBITE (6,771) I 

771 F0RflAT(I4,* DELTA TOC LARGE*) 

800 NPNT(I)=0 

FDER (I) =0. 

900 CONTINUE 

NPKT (NB1)=0 

IF(ABS(SS(NB1,NB1) ) .L£.ABS(fDEP(NBl) «C0NVEG(4) )) GO TO 920 
FDER (NB1)=FDEB (NB1)/SS (NEl ,NB1) 

SS (MB1,NB1) =1. 

NPNT (NB1) = 1 
920 DO 95C 1=2, NB1 

IF (NPNT (I) .EQ.O) GO TO 950 


ini=i-i 


DC 940 J=1,inl 

IF (NPNT (J) .Eg. 0) GO TO 940 
FDEE (J) =FDBB (J) -FDEE (I) *SS (J, I) 
SDEL=SS (J,I) 

DC 930 K=I,NB1 

IF (NPNT (K) .Eg. 0) GO TO 930 
SS (J,K) =SS (J,K) -SS (I,K) *3 DEL 
930 CONTINUE 
940 CONTINUE 
950 CONTINUE 

WRITE (6,1000) (FDEB(I) ,I=1,NB1) 
1000 FOFHAT (1X,10F10.4) 

WRITE (6,1020) (NPNT (I) ,I=1,NB1) 
1020 FOBnAT(30l3) 

997 DO 999 1=1, NB1 
DELIA (I) =0. 

IF (NPNT (I) .Eg. 0) GO TC 999 
DELTA (I) =FDER (I) 

999 CONTINUE 
2000 COMINUE 


RETURN 

END 

SUBROUTINE nJLXHl N ( N, ARB AT » XHIN , XniX, RANGE) 
DIflENSlON AFRAY(N) 

XHINxABRAT (1) 

XHAXsABBAT (1) 

DO 100 I«1«N 

IF (XNIN.GT. ARRAY (I) ) X(11N=ABRAY (I) 

IF (XHAX.IT. ABBAY(D) XHAX^ABB AY (1) 

100 CONTINUE 

BANGE<XHAX-XHIN 
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RETURN 

END 

/ EXEC LNKEDT 
INCLUDE STSLIB(LVNIC) 

ENTRY LYniO 
NAHF LCGPB2(B) 

/ EXEC VICAB 

•LOG2 JOB (IOIIrII, 2) ,11, CLASSED 
•PASSHOBD Z 

♦BOUTS SAVE TEHP01 H YL. QCV769. LI. BJE. LOGI 

EGIONr450K 

1NDrOX24 

•LOGPB2,OX24,«, ,P1 

,P1 

BAN»8 

TEB.4 

CLA, 10 

ONV, 30. J. 0,0. 0001 ,0.3 

QOD 

EIG 

BAIN 

LASS,1, 7,233,3.10 5,99.5,6 116,186,5,6 51,177.2.5 

LASS. 2. 91,163,4.5 23.151,4,10 

LASS, 3, 119,85,2,4 101,81.2,5 123,46,5,2 126,161,4,5 

LASS, 4. 02,139,2,5 54,91,2,5 97,177,4,2 91,203,1,5 

LASS. 10, 72.178,2,7 34,136,2,5 

LASS, 6, 1 19, 139,3,5 24,211,3,5 

LASS, 7. 56,111,2,5 116,125,3,7 

LASS, 8. 100, 219, 1,8 40, 111, 1, 10 98,31, 1,8 61.58.2,4 

LASS, 9, 25,101,2,10 113.111,2,5 6,112,1,5 

LASS, 5. 4,84,2,5 19,124,2,10 122,203,1,5 72,88,2,5 

ND 

/YTP2.SYSIN UD * 

X24:=UTL.0CV76 9. LI.HSZHAN2 

/ 




VICAP PROG PBCLSQ 


VPBOBSOB job (4530,LI, 1,2) ,II,CIASS=D 

V fort PA II PBOC ORIGIMAL i^mGc IS 

V* PORTRAM EITEHDBD ENHANCED TAS OF POOR QH A "TV 

VFOBT exec PGH=IPEAAB,BEGION=600K, 

V PARH»*0PT1N1ZE(3) ,NAHE (HAIN44) • 

VSTEPLIB DD OSN^STSLB1.FOBTHQ.LOADLIB,D1SP=SUE 
VSYSPBINT DD SISOOT=A 

VSrSPONCH DD STSOOT=B 
VSYSTEBB DD SYSOUT=A 

VSYSOTl DD UNIT=SYSDA,SPACE=(TRK, (10,10) ) ,DCB=BLKSIZE=3465 

VSYS0T2 DD aNIT=SYSDA,SPACE= (TBK , (10, 1 0) ) , DCB=BLKSIZE=5048 

ySYSLIN DD DSR^&6LOADSET,ON11=SYSDA,DISP= (HOD, PASS) , 

V SPACE=(CYL, (2,1) ) ,DCB=BIKSIZE=3200 
7 PEND 

7ASSENBLE PBOC 

7ASH EXEC PGn==ASNGASH,EEGION=136K,PABH=*LOAD, NODECK* 

7STEPLIB DD DISP^SHB, DSN-SYSLBI • ASHG. V2L7 A. LO ADLl B 

7SYSIIB DD DISP=SHR,DSN=SYS1.HACLIB 
7 DD DISP=SHB,DSN=UCSB.HACLIB 

7SXSUT1 DD DSN=66SYSUT1,SPACE= (1700, (400,50) ) , 

7 0NIT= (SYSDA,SEP=SYSLIE) 

7SYS0T2 DD DSN=66SYSUT2,SPACE^ (1700, (400,50) ) ,UN1T=SYSDA 

7SYS0T3 DD DSN=66SYSUT3,SPACB= (1700, (400,50) ) , 

/ 0NIT= (SYSDA,SEP= (SYSUT1 ,SYSLIB) ) 

ySYSPHINT DD SYSCUT=A 
ySYSPUNCH DD SYSOOT=B 

ySYSGO DD DSN=6£LOADSET,DISP- (HOD, PASS) ,UNIT=SYSDA, 

y SPACE=(80, (200,50) ) 

y PEND 

/LNKEDT PBOC PREC=SINGLE 

/♦ PLOTTING ADDED 12-80 TAS 

/LKED EXEC PGH = IEHL, PABH= (M AP, LET , LIST) ,CCND= (12, LT) ,REGiON=110K 

ySYSLIB DD DISP=SHF,DSN=HrL. BTS927.STBAHLEE,IP11.SDSLIB 

/ DD DISP=SHB,DSN=SYS1-PLOTLOAD.CALCOHP 

/ DD DISP=SHB,DSN=SYS1.P10TL0AD. UCSB 

/ DD DISP=SHE,DSN-SYSLBl.POFTHg.FOPTtIB 

/ DD DISP=SHR,DSN=SYSLBUFOPTHFXT. R213.FOBTLIB 

/ DD DISP=SHF,DSN=UCSB.F0RTL1B 

/ DC DDNAME=6PBEC 

/DOUBLE DD DISP=SHR, DSN=S YSLB 1 . I SSL. LO ADLIB. DP8 
/SINGLE DD D1SP=SHP,DSN=SYSLB1.IHSL.10ADLIB.SPR 

/SYSLHOD DD D ISP=SHB, DSN = fci YL. BYS9 27. STB ABLER. CLASSLIB 
/SYSPBINT DD SYSOUT=A 

/SYSUT1 DD UNIT=SYSDA,SPACE= (CYL, (5, 1) ) 

/SYSLIN DD DSN=66LOADSET, DISP= (OLD, DELFTE) 

/ DC DDNANE=SYSIN 

/ PEND 

/PROHSOP EXEC FORTRAN 

PR08CLAS, A VICAB PPOGPAS THAT TAKES PAPAHETEPS FPOH VICAR PBOGRAH 
LOGIT, AND GENERATES A CLASSIFIED IHAGE BASED ON CALCULATED 
PROBABllITIES. 

WRITTEN BY PAUL NAYNAPD, GHSU, UCSB, SANTA BARBARA, CA. APRIL 1980 
PROBCLAS CAN HANDLE 10 CLASSES AND 20 PAPAHETEPS PER CLASS 


INTEGEP*4 PARH (600) ,CPAP (24) 



IliIEGEB*4 KHD(24) /‘MSS UCLA* BETA* , *QOAD* r20*« •/ 

INTEGEP SL.SS 

LOGICAL*! IBUP (7200) «OBaE(7200) 

LOGICAL BYTE^OfEB 

BEAL*4 BPABH(600) , BETA (500) , DATA (50) ,PBOB(10) «BBUP(7200) ,TOUT(10) 
EQUIVALENCE (PABH (1) .BPABH (1)) , (IHOBD.BYTE) 

IDIIi*50 
10NE>1 

CALL PARAH (IMD, PABH, 600) 

SL>PABH(1) 

SS^PABH(2) 

NL»PABH(3) 

NS=PA6H(4) 

NBOHIsPABH (5) 

NSAHPI-PABH (6) 

NPAB=PABtl(10) 

CALL KSCAN (PABH, NPAB,KHD,QPAB, £5000) 

NBAND^I 

IP (QPAB(1) .EQ.1) NBAND-PABH (KHD(1) ^2) 

NSAHP=NSANPI/HBAND 
IP (NS.EQ. NSANPI) NS=NSAflP 
NC-PABn(KHD(2) *2) 

NB-NBAND4-1 
IOFP=0 

1CNT = KU0 (3) ♦! 

NAOO-NB 

IP (KHD (4) .NE.O) MADD=NB«(NB*NBAND) /2 
NBT=NADD*NC 

DO 10 1=1, NBT 
ICNT=1CNT*1 
BETA (I) =BPABH (ICNT) 

CONTINUE 

CALL OPEN(NLBL, 2, 0,1, 0,32766) 

CALL LABELS (I ND, NLRO, 1 , 0 , 0 , 0, 0, NS, OBUP , 0,32768) 

IBPC=NLBL»SL-1 
DO 100 IBOH=1,NL 
IBBC=IBEC»1 

CALL BEAD(IND,2,IBBC,0,IOFP,NSANPI,1BUF,0) 

IF (IND.EQ.4) GO TO 5010 
DO 15 I=1,NSAn?I 
BTTB=IB0P(I) 

BEUF(I) =FLOAT(IUOBD) 

CONTINUE 
DATA (1) = 1- 
DC 50 J=1,NS 

INC=SS^J- (NSAHP*!) 

DC 20 I=1,NBAND 
I2=I^1 

INC=INC^NSAF1P 
DATA (12) =EBaP(INC) 

CONTINUE 
DO 30 K=1,NC 
OUT=0- 
K1=K-1 

NSTBT»K1*NADD 
DO 25 J2=1,NB 
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25 


23 

26 


35 


NSTPT*NSTBT»1 

OUT=OOT^DATA (J2) ♦BETA (NSTBT) 
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DO 35 1=2, HC 
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IPNT*I 
CONTINUE 
IUOBD=IPNT 
OeUF(J) =BTTE 
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ABSTRACT 


Soil Loss Prediction in a Geographic 
Information System Format 

by 

Michael Alan Spanner 

Soil loss due to erosion from rainfall was successfully pre- 
dicted for the Santa Paula 7.5 minute quadrangle, Ventura County, 
California, utilizing the VICAR/IBIS image processing and geographic 
information system to simulate the Universal Soil Loss Equation 
(USLE). This work was part of a NASA funded research project inves- 
tigating methods of incorporating collateral information in Landsat 
classification and modelling procedures (NSG-2377). Representing 
the rainfall, soil erodability, length of slope, slope gradient, 
crop managemant and soil loss tolerance coefficients of the USLE 
were data planes generated from digital Landsat MSS data, USGS 
Digital Elevation Model topographic data, a digitized NOAA isopluvial 
map and digitized USDA Soil Conservation Service soil maps. The 
Pearson product moment correlation coefficient, R, of the soil loss 
values obtained from the developed geobased model to a sample of 
manually derived soil losses was .91 after a log transform, signifi- 
cant to the .0001 level. The system accurately targeted soil loss 
problem areas for subsequent analysis by Soil Conservation Service 
personnel. 



Estimates of accuracy for the Intermediate data planes repre- 
senting rainfall, soil erodability, length of slope, slope gradient, 
crop management and soil loss tolerance ranged from .81 to 100 per- 
cent. These intermediate data planes were accurate and flexible in 
their representation of environmental information. Incorporating 
slope and elevation information in the land use/land cover classifi- 
cation considerably improved classification accuracies. Watershed 
characteristics including slope and length of slope were effectively 
depicted using the DEM topographic data. Digitized soil maps were 
a powerful information source, allowing soil erodability and soil 
loss tolerance to be represented in an image format. Useful 
statistics from the soil maps, topographic data and land cover were 
obtained for this watershed. Maps and statistics produced from this 
soil loss information system will be of great assistance to resource 
managers dealing with the problem of soil loss in agricultural areas. 
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CHAPTER 1 
INTRODUCTION 


The purpose of this research is to demonstrate the potential 
of Landsat and collateral information in a geographic information 
system (GIS) format to simulate the Universal Soil Loss Equation 
(USLE). The Universal Soil Loss Equation, developed by the United 
States Department of Agriculture (USDA) Soil Conservation Service 
(SCS), predicts soil loss due to rainfall in agricultural regions 
(Wischmeier and Smith, 1965). Soil loss caused by erosion from 
rainfall is a serious problem in the United States. The Pacific 
Southwest Inter-Agency Committee (PSIAC) estimated in 1971 that 
sixty-five percent of our agricultural regions require some form of 
erosion control. Portions of Ventura County are rated as moderate 
to severe in erosion hazard by the PSIAC. 

Variables of the Universal Soil Loss Equation are coefficients 
of rainfall, soil erodability, length of slope, slope gradient, 
crop management and conservation practice. The collection of these 
variables for the USLE is accomplished by Soi ! Conservation Service 
personnel on a per site basis. From tables and charts (USDA 
Science and Education Administration, 1978), as well as measurements 
obtained in the field, a prediction of soil loss is calculated for 
an area in terms of tons of soil loss per acre per year. These 
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sites range from 1-10 acres depending on the homogeneity of the 
location. A large area determination of soil loss is not possible, 
due to the nature of the USLE. The input of the Universal Soil 
Loss Equation to the Video Image Communication and Retrieval Image 
Based Information System (VICAR/IBIS) allowed a much more extensive 
area to be assessed for erosion potential. Each site was repre- 
sented on a georeferenced grid. Variables from the USLE for the 
Santa Paula 7.5 minute quadrangle in southern Ventura County 
(Figure 1) were input to VICAR/IBIS, deriving a prediction of soil 
loss in tons per acre per year for this mixed agricultural and 
rangeland location. 

Increasing accumulations of sediment are being deposited in 
Mugu lagoon, a terminus of drainage for the Santa Paula 7.5 minute 
quadrangle (Figure 2). The siltation of Mugu lagoon is due largely 
to the poor soil conservation practices, sparse grass cover and 
frequent fire in the upper portions of the watershed. New avocado 
orchards are being introduced in the canyons extending into the 
steeper mountainous areas. The sparse ground cover provided by the 
iirmature avocados, in conjunction with the steep slopes, creates a 
potential soil loss problem. Mediterranean annual grasses growing 
in some of the mountainous areas inadequately protect the soil from 
rainfall induced erosion. Two fires have occurred in the South 
Mountain area in the past two years. Removal of the natural vege- 
tative cover by fire greatly increases soil erosion rates. Major 
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Figure 1 Location of the study area for this Master's Thesis 
Research, che Santa Paula 7.5 minute guadranqle, 
Ventura County, California. 
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Figure 2 This map shows the location of the major watershed 
features discusseo within the text. 
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planning and construction efforts are necessary to accomodate the 
increase in sediment load. A prediction of erosion rates will sub- 
stantially aid such agencies as the USDA Soil Conservation Service, 
the U.S. Army Corps of Engineers, the California Coastal Commission, 
the Ventura County Water Agency and the U.S. Navy facility at Point 
Mugu for this watershed alone. Implementation of this soil loss 
model to other locations will assist a wide variety of interests 
including conservationists, growers, planners, and developers. 

Digital image processing for the soil loss study was performed 
on the Video Image Conriuni cation and Retrieval Image Based Infor- 
mation System (VICAR/IBIS). VICAR has been under development at 
the Jet Propulsion Laboratory (JPL), Pasadena- California, since 
the mid- sixties. Originally created to process image data from 
planetary exploration programs, VICAR has been expanded to include 
applications in earth resources, land use, biomedicine and as- 
tronomy. IBIS is a geographic information system which allows the 
conversion of georeferenced data to an image format for use with 
remotely sensed data. IBIS was built upon VICAR, permitting image 
to image registration. Images of di-^Terent scale from any number 
of datasets can be superimposed, allowing corresponding pixels to 
represent the same geographic location (Bryant and Zobrist, 1976). 

A land use/land cover classification from digital Landsat 
data, in conjunction with slope and elevation inf', .-.nation from 
United States Geological Survey (USGS) Digital Elevation Model (DEM) 
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topographic data, provided the image base for this soil loss study. 
The slope and length of slope coefficients were also derived from 
the DEM. USDA Soil Conservation Service soil maps provided the 
soil erodability and soil loss tolerance information. The rainfall 
factor was obtained from National Oceanic and Atmospheric Adn.inis- 
tration (NOAA) isopluvial maps. The soil and isoplu\ial maps were 
digitized to facilitate digital processing. The conservation prac- 
tice factor was not inclu''’.J in this experiment due to the inability 
of Landsat to resolve conservation practice techniques at a site. 

The USDA Soil Conservation Service in So...is, California pro- 
vided technical assistance for this work. They also enabled the 
field verification crew to gain access to several of the ranches in 
the study area. The Soil Conservation Service is presently pur- 
suing solutions for the problem of soil loss in this watershed. 

They are concerned with sediment loss reducing crop productivity, 
as well as the problem of sediment accumulation in the upper water- 
shed and in Mugu lagoon. A prediction of soil erosion rates will 
be of considerable assistance to the SCS. As such SCS personnel 
are interested in the results from this project; this thesis will 
be presented to them. 
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CHAPTER 2 
BACKGROUND 


Study Area 

The Santa Paula, California, seven and one-half minute quad- 
rangle comprises the northeastern border of the Oxnard Plain, a 
fertile region of prime agricultural land (Figure 2). This area 
displays a typical Mediterranean climate of warm, dry summers and 
cool, wetter winters. Average annual rainfall is approximately 
seventeen inches (NOAA Environmental Data and Information Center, 
1978). The Oxnard plain is the site of numerous studies by the 
Geography Remote Sensing Unit, University of California, Santa 
Barbara (Estes et al. 1979, Stow et al. 1980, Hallada et al . 1981, 
Tinney et al . 1981). These experiments involved land cover class- 
ification, land use/land cover change detection and automatic clus- 
ter labelling. 

The Santa Clara river, the main drainage for Ventura County 
runs from northeast to southwest through the Santa Paula quadrangle. 
The floodplain of the Santa Clara river is the site of many lemon 
and orange groves, aS well as some avocado orchards. These are 
fairly mature orchards, averaging more than ten years in age. Run- 
ning east-west through the southern portion of the quadrangle is the 
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Las Posas Valley. Containing rich alluvial soils eroded from 
the marine sedimentary sequences of South Mountain, the Las Posas 
Valley is also the site of lemon and orange groves. Avocados and 
row crops such as celery and squash are a major source of agricul- 
tural income here as well. 

The maximum elevation of the Santa Paula quadrangle is approx- 
imately 2250 feet on South Mountain, compared to about 200 feet at 
the lowest point on the Santa Clara River floodplain. New avocado 
orchards extend into some of the lower canyons of South Mountain 
and in the foothills north of the Santa Clara River. Avocados are 
planted on slopes up to and exceeding ten degrees. The steep 
slopes in conjunction with the sparse cover provided by the immature 
orchards present a potential soil loss problem in these canyon 
lands (Figure 3). 

Cattle grazing and horse ranching predominate both in the foot- 
hills of South Mountain and in the foothills north of the Santa 
Clara river. Vegetation here consists mainly *>f Mediterranean 
annual grasses and chaparral. However, the grass cover is extremely 
sparse in some locations. There is also a small oak woodland com- 
munity on the north slope of South Mountain. The hills and moun- 
tains are subject to frequent fire. Removal of the natural vege- 
tation due to these fires contributes to accelerated soil losses. 
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Figure 3 Immature avocado orchard on slope of approximately ten 
degrees. Orchards such as these are a major cause of 
increased soil erosion in the Santa Paula area. 
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Surface Wash 

For the purposes of this thesis, discussion of movement of 
soil downs lope is limited to surface wash. Surface wash is defined 
by Young (1972, p. 62) as: 

...the downslope transport of regolith material 
across the ground surface, through the agency of 
moving water. 

Other forms of movement of debris downslope for the Santa Paula 
quadrangle include creep, mudflow, earthflow, debris avalanche and 
debris slide as defined by Vames (1958). These processes occur 
in the study area to some extent. However, the dominant downslope 
form of transport of regolithic materials in the Santa Paula quad- 
rangle is surface wash, as evidenced by field inspection. 

There are two distinct processes involved in surface wash. 

The first is the impact of rain on the ground, termed raindrop 
impact. The second process, the flow of water across the ground, 
surface flow, can be subdivided into two categories: sheetwash, 

whereby the ground is entirely or mostly covered by a moving layer 
of water; and, rillwash, in which the water flows in very small 
channels. Channels in rillwash change their size and location on 
the slope. There are two separate effects of soil detachment by 
rainfall and runoff: soil detachment, the removal of particles 

from their original position in the soil; and wash transport, the 
movement of the detached particles downslope. 
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Loss of soil in agricultural regions is termed accelerated 
erosion. Accelerated erosion occurs in locations with cleared 
vegetation, bare ground or weakened soil structures. This is the 
case in much of the agricultural portions of the Santa Paula quad- 
rangle. 

Under undisturbed conditions, as found in the naturally vege- 
tated regions of the Santa Paula quadrangle. Young (p. 66) dis- 
cussed three important aspects of soil movement: 

1) Raindrop impact is a more powerful agent of soil 
detachment than surface flow, but is relatively 
less important as an agent of transport. The 
detachment caused by raindrops does not vary with 
position on the slope nor, substantially, with 
slope angle. 

2) The lowering by surface wash varies directly 
with the slope angle. The precise relation 
varies with soil and surface conditions, but a 
linear proportional increase with the size of 

the slope angle is the best general approximation. 

3) The ground loss caused by surface wash varies 
with approximately the 0.6 power of distance from 
the slope crest. 

In addition. Young (p. 66) made the following discrimination: 


In respect of surface wash, a distinction may be 
made between control by detachment and control by 
transport. If the agents of transport are more 
than able to carry away dovvnslope all the mater- 
ial supplied by detachment, the rate of ground 
loss is determined by the rate of detachment. If 
control is by wash transport, detached material is 
supplied faster than it can be transported, so that 
the transporting media are then always fully loaded. 
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In other words, under control by wash transport, the rate of soil 
loss increases lower on the slope, but under control by detachment, 
the soil loss is independent of distance from the ridge. Slope 
angle is more important for control by wash transport than control 
by detachment. 

Soil Loss Models 

A large number of models have been developed for predicting 
erosion and sediment yields. With few exceptions these models are 
based on soil, geologic, climatic, topographic, vegetative and land 
use/1 and cover information. Musgrave (1947) developed the first 
quantitative evaluation of factors in soil erosion. His equation, 
a first approximation of predicted sheet erosion, is based on soil, 
crop cover, degree of slope, length of slope and precipitation in- 
formation. This model is the forerunner of the Universal Soil Loss 
Equation. The equation has had wide application in the prediction 
of average annual sheet erosion. A modified Musgrave equation 
created by substituting the soil erodability and rainfall factors 
from the USLE has improved this model (Pacific Southwest Inter- 
Agency Coimiittee, 1968). 

The Pacific Southwest Inter-Agency Committee (1968) developed 
a sediment loss model suitable for use in the Pacific Southwest. 

A rating system based on surface geology, soils, climate, runoff, 
topography, ground cover, land use and upland erosion was developed 
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This is a highly subjective model; the input variables are overly 

dependent on user interpretation. Predictions are in terms of low, 

medium or high sediment loss. To date, this model has not been 

tested or evaluated in a quantitative manner. » 

Meeuwig (1971) developed a model predicting soil stability on 
high elevation rangeland. A predictive equation was developed using 
soil, vegetation cover, slope, and organic matter content para- 
meters. Meeuwig's equation explains approximately 75 percent of the 
variance of the log of erosion. The author warned that this equa- 
tion should not be applied indiscriminately because it was derived 
from erosion measurements based on fixed amounts of simulated -ain- 
fall on small homogeneous plots. The geographic range of this 
model is therefore limited. 

Foster and Meyer (1972) developed a mathematical relationship 
for the continui ty-of-mass transport, and an equation relating 
detachment of sediment by runoff and sediment load. For known soil, 
precipitation and topographic characteristics, the erosion pattern 
along a slope was predicted. This method has sound theoretical 
development; however, further testing is required to extend the 
range of its applicability. 

A practical model for predicting mass wasting was presented by i 

Swanston et al. (1980) and described by Logan (1981, p. 4), based 
on: 



14 


...a subjective evaluation of the relative stability 
of an area using soils, geologic, topographic, cli- 
matic, and vegetation indicators obtained from 
aerial photographs, maps and field observations, 

(and) a limited strength-stress analysis of the 
unstable sites using available or easily generated 
field data. 

This model was simulated by Logan in a data base approach. However, 
the Swanston et al . model is still being developed; and, moreover, 
it is best suited for forest watersheds. 

These models are not suitable for soil loss prediction within 
a geographic information system format for three reasons: 

(1) They are unacceptable for generating image data sets 
(Meeuwig, 1971; Foster and Meyer, 1972); 

(2) They are not applicable to a mixed agricultural and 
rangeland region (Meeuwig, 1971; Swanston et al. 1980); 
and, 

(3) There is insufficient quantitative verification of the 
model (Musgrave, 1947; Pacific Southwest Inter-Agency 
Committee, 1968; Meeuwig, 1971; Swanston et al. 1980). 

A model that is suitable for soil loss prediction in a geo- 
graphic information system format, the Universal Soil Loss Equa- 
tion, is presented in Chapter Three. 
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CHAPTER 3 » 

SOIL LOSS PREDICTION 

I 

Universal Soil Loss Equation 

The Universal Soil Loss Equation developed by the United States 
Department of Agriculture Soil Conservation Service (Wischmeier and 
Smith, 1965) is appropriate with respect to the three categories 
discussed in Chapter Two for geographic information system soil 
loss modelling. The USLE predicts rill and sheet erosion due to 
rainfall "^or agricultural regions in terms of tons of soil lost per 

1 

acre per year. The Universal Soil Loss Equation was originally 
developed from 10,000 plot years of runoff end soil loss data col- 
lected from forty-seven research stations in twenty-four states. 

Studies were undertaken which measured the contribution of rain- 
fall, soil properties, slope angle, length of slope, crop cover 
and conservation practice to the loss of soil. A prediction of soil 
loss was obtained for regions east of the Rocky Mountains. Recent 
work has extended the USLE to the entire United States, including 
range and forest lands (USDA Science and Education Administration, 

1978). I 

The basic soil loss equation is: 

A = R*K*L*S*C*P 
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where: 

A * Predicted soil loss in tons/acre/year, 

R » Rainfall Factor, 

K ■ Soil Erodability Factor in tons/acre/year, 

L * Length of Slope Factor, 

S = Slope Gradient Factor, 

C = Crop Management Factor, and 
P » Conservation Practice Factor. 

R,L,S,C and P are dimensionless coefficients. The product of 

R,K,L,S,C and P provides an estimate of soil loss measured in tons 

per acre per year. 

Rainfall (R) 

Research from the Runoff and Soil Loss Data Center of the 
Agricultural Research Service (ARS), Purdue University, indicates 
that when factors other than rainfall are held constant, soil loss 
from cultivated fields is directly proportional to the product of 
two rainfall characteristics: the total kinetic energy of the 

storm, and the maximum thirty minute rainfall intensity, (El) 
(Wischmeier, 1962). The yearly sum of the El values from each 
storm provides the R value, quantifying the impact energy of rain- 
fall in dislodging soil from the ground. The amount of soil which 
actually leaves the area is affected by the K,L,S,C and P coef- 
ficients. 

The USDA Soil Conservation Service, Davis, California (1977) 
has produced a curve relating the two-year, six-hour maximum rain- 
fall to the annual R factor. Two-year, six-hour maximum rainfall 
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information is available from the NOAA Weather Atlas (Miller et al. 
1973) in the form of an isopluvial map. The R value can be deter- 
mined for all locations in the United States, although the scale 
of the map is quite small. 

Soil Erodability (K) 

Soil erodability is defined as the tendency for a specific soil 
to erode, holding the R,L,S,C and P coefficients constant 
(Wischmeier and Smith, 1965). The soil erodability, K, is the 
average soil loss measured in tons per acre per year per unit 
erosion index, El, on a unit plot of land. A unit plot is land in 
continuous fallow, tilled up and down, 72.6 feet long, with a nine 
percent slope. With the other factors at unity, K = A/EI. Several 
hundreds of plot-years of measurement on twenty- three major soil 
groups quantified the K value. Based on the rainfall; K was deter- 
mined for each soil, ranging from .03-. 69 (Olson and Wischmeier, 
1963). 

In a later study (Wischmeier et al. 1971), a soil erodability 
nomograph was developed which predicts erodability based on five 
characteristics of the soil. These are percent silt and very fine 
sand, percent sand greater than .10 mm, organic matter content, 
structure and permeability. As percent silt and fine sand increase, 
erodability goes up. By the same token, as the percent sand in- 
creases, erodability decreases, an effect produced by the greater 
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surface area in silt and the greater potential for dis lodgement. 

As organic content increases, eroda'jility decreases, and as the 
permeability and granularity increase, the erodability is reduced. 
The twelve K factors in use by the SCS are .10, .15, .17, .20, .24, 
.28, .32, .37, .43, .49, .55 and .64. The higher the number, the 
more erodable the soi"". 

All non-U. S. Forest Service land of Ventura County has been 
mapped based on soil characteristics by the Ventura County USDA 
Soil Conservation Service (1969). A project is underway in Cali- 
fornia to evaluate K factors for all soils. This project has been 
completed by the SCS personnel for Ventura County; the K values 
have been calculated for these soils. 

Length of Slope (L) 

The length of slope factor, L, and the slope n>"adient factor, 

S, are calculated separately. However, in the application of the 
soil loss equation they are evaluated together. Slope length is the 
distance from a point in a watershed to the origin of runoff for 
that point, generally a ridge or a peak (Zingg, 1940). The longer 
the slope, the greater the possibility for erosion due to the in- 
creased contribution of runoff from above the site. The potential 
for partial area flow is increased as well. Length of slope data 
were collected by Wischmeier et al . (1958) from 136 location years 
at ten locations. This work indicated that soil loss per unit area 
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is proportional to some percent of the slope length. The L factor 

is the ratio of field soil loss to that from one with a 72.6 foot 

0 3 

long slope. Therefore, L = (x/72.6) ‘ , where x is the length of 
slope. 

Slope Gradient (S) 

The slope gradient factor, S, is computed from the nine per- 
cent slope unit plot. A nine percent slope has a value of one when 
the length of slope equals 72.6 feet. An initial study on the 
effect of slope angle on soil loss by Zingg (1940) concluded that 
soil loss varies as the 1.4 power of percent slope. Seventeen 
years of data collected in two locations by Wischmeier et al . 

(1958) indicated that soil loss is equal to (.43 + .30S + 0.045^), 
where S is the percent slope. In California, for slopes less 
than nine percent, the relation (0.43 + 0.30S + .043S^) / 6.617, 
is used to calculate the S factor. For slopes of nine percent or 
greater, (S/9) defines the S coefficient (USDA Soil Conserva- 
tion Service, Davis, 1977). 

Crop Management (C) 

The crop management faccor, C, is the ratio of soil loss 
from land cropped under the specific conditions encountered, to the 
theoretical soil loss of the same crop under tilled continuous fallow 
conditions (Wischmeier, 1960). The C factor is based on the kind 
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of crop growing in a specific area and the protection it provides 
against erosion. Crop stages considered in the calculation of the 
C values are rough fallow, seedling, establishment, growing, 
mature crop and residue or stubble. Treatment of crop residue has 
a major impact on the potential for erosion. Residue on the sur- 
face provides more protection than residue plowed under. Conditions 
evaluated in the determination of the C value include: the kind of 

residues that are left after the crop has been harvested, whether 
they are removed or remain, and, whether they are left on the sur- 
face or plowed under. 

Data assessing these variables of crop management were col- 
lected from 10,000 plot years of runoff and soil loss data from 
forty-seven reserach stations in twenty- four states. Regression 
and analysis of variance techniques were applied to investigate 
relationships and interaction effects of crop management on soil 
erosion. Tables were presented which provide C values under the 
varying conditions of crop management practice on cultivated agri- 
cultural land east of the Rocky Mountains (Wischmeier and Smith, 
1965). 

Current work has extended the C values to include undisturbed 
land encountered west of the Rocky Mountains (USDA Science and Edu- 
cation Administration, 1978). Undisturbed land categories include 
pasture land, rangeland, woodland and idle land. The extension 
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of C values is based on three separate but interrelated factors: 

(1) Canopy cover. The canopy reduces erosion by rainfall by 
lessening the impact of raindrops on the soil surface; 

(2) Percent of vegetation in close contact with the ground. 

100 percent vegetation cover reduces the impact of rain 

on the soil to near zero. Vegetation also slows the speed 
of overland flow, reducing its energy* for erosion; and, 

(3) Vegetation litter on the surface. Litter reduces the 
ability of the rain to erode by lessening the impact of 
the rain, as well as limiting the transporting ability 
of runoff. 

A chart is provided by the Davis, California, SCS (1977) which 
displays C values for undisturbed areas based upon canopy cover 
type and height of cover, percent ground cover and type of ground 
cover. 

Conservation Practice (P) 

The conservation practice factor, P, is the ratio of soil loss 
from the specific conservation practice encountered, to the soil 
loss from up and down hill cultivation (USDA Science and Education 
Administration, 1978). Again, this relation is empirically derived 
from test data in which different types of conservation practice 
were evaluated. Conservation practices reduce the runoff inten- 
sity by altering the direction of the water flow. They also 
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modify the transport of sediment, generally reducing the effective 
gradient of the field in question. A value of one is assigned to a 
plot lacking soil conservation techniques. Practices that reduce 
soil loss include contouring, cntour stripcropping and terracing. 

Soil Loss Tolerance (T) 

A soil loss tolerance, T, has been calculated; the maximum 
permissible annual soil loss (USDA Agricultural Research Service, 
1961). The soil loss tolerance is quantified in terms of tons of 
soil loss per acre per year. The T value is assigned by soil 
scientists based on their judgement of the amount of erosion that 
will still permit a high level of crop productivity to be sustained 
economically and indefinitely. A site is considered to have a soil 
loss problem if the calculated A value exceeds the T value. 
Estimates of the T factor are based on the following criteria: 

(1) Maintenance of an adequate rooting depth for crop production. 
On shallow soils over hard rock, it is necessary to conserve 
the soil. A low T value will be assigned; 

(2) Soils that can be renewed with tillage, fertilizers, organic 
matter and other practices. Renewable soils will be assigned 
a higher T factor. Nonrenewable soils include soils developed 
on hard and soft rock with unfavorable nutrient or textural 
composition. Duripans, natric, strong calcis, gypsic, petro- 
calsic, and salic soils are included in this category; and. 
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(3) Value of the nutrients lost and the cost of maintenance 
of waterways. Ditches may become clogged with sediment;- 
a low T value will be assigned. 

The Ventura County SCS has assigned T values for the soil 
series encountered in Ventura County. Tolerances range from one to 
five tons per acre per year depending on the field conditions. 

Soil Loss Information System Models 

Remotely sensed data has been used to derive specific coef- 
ficients of the Universal Soil Loss Equation. Morgan et al . (1979) 
used color and color infrared 70 mm photography at a scale of 
1:60,000 to calculate the cropping management factor of the Univer- 
sal Soil Loss Equation. Detailed information regarding plowing 
practices and crop residue cover was discernable on both the color 
and color infrared imagery at this scale. In a later study, Morgan 
et al. (1980) used the same imagery to calculate the conservation 
practice factor. They were able to resolve tillage, contour strip- 
cropping and grass lined waterways for an agricultural region in 
Wisconsin. 

Stephens and Cihlar (1981) analyzed Landsat II, simulated 
Landsat D (Thematic Mapper), and simulated Spot I data to classify 
a region composed of pasture, forest and cropland. The simulated 
Landsat D and Spot I data were resampled to 10 meter resolution. 
Accuracies were 88 percent, 92 percent and 94 percent for the 
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Landsat II, Landsat D and Spot I data, respectively. Next, 

Stephens and Cihlar calculated the correlation between the ratio 
of the near infrared/red reflectance to the C coefficient for selec- 
ted study sites. The Pearson product moment correlation coefficient 
ficients, R, for simulated Spot I, Landsat D and Landsat II data 
were .97, .95 and .85, respectively. This is an important study 
because the correlation was determined on the basis of vigor of the 
vegetation. Further research is necessary along this line of in- 
quiry. The potential of higher resolution sensors with improved 
spectral characteristics is obvious here. 

Researchers have input the Universal Soil Loss Equation into 
geographic information systems for soil loss prediction. The prob- 
lem with most of this work is the reliance on manual derivation of 
topographic data and an inconsistent source of crop management 
information, or both. Singer et al. (1976) developed a computer 
simulationof soil loss using the Universal Soil Loss Equation. In 
this study, vegetation maps provided the crop management factor and 
topographic data were manually derived from topographic sheets for 
a rangeland site in northern California. The authors also attempted 
to model the flow of sediments after erosion based on a computer 
routine by Kling (1973). However, no basis for the flow of sedi- 
ments was discussed in this paper. 

Degani et al. (1979) presented Soilcart, an interactive com- 
puter simulation of soil loss. In this study, crop cover was coded 
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into the geographic information system from air photographs. Slope 
and slope length were manually derived from topographic sheets. The 
interactive aspect of Soil cart make it an important teaching tool; 
however, for soil erosion modelling, its usefulness is limited. 

Patterson and McAdams (1980) input the Universal Soil Loss 
Equation to a geographic information system to produce erosion 
hazard potential maps. Landsat MSS data for 1973 and 1978 provided 
the image base. Slope and length of slope information was derived 
from soil maps. A high level of flexibility in processing was 
developed in this study. Land cover for 1973 and for 1978, slope 
classes, slope length classes, and 1973 and 1978 erosion potential 
images were displayed. However, topographic data was manually en- 
coded in this work. A truly automated soil loss information system 
was not developed. 

Armond Joyce (1979) applied a modified Musgrave's equation for 
soil loss prediction in a GIS. The modified Musgrave's equation is 
a hybrid version of the Universal Soil Loss equation using the soil 
erodability and rainfall factors from the USLE. This research pro- 
duced an erosion hazard-reforestation needs assessment. Joyce's 
approach was similar to Patterson and McAdams. The processing for 
both of these projects was performed at the National Space Tech- 
nology Laboratory (NSTL) Earth Sciences Laboratory (ERL), Bay 
St. Louis, Mississippi. The main difference between the two studies 
was the use of Musgrave's modified soil loss equation and the 
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manual derivation of topographic factors from topographic maps by 
Joyce. In addition, a forested watershed was assessed instead of 
an agricultural region. Again, a fully automated system was not 
developed in this research. 

Berger and Jensen (1980) modeled soil loss and flood potential 
due to urbanization in a humid subtropical southeastern environment. 
First, soil loss was predicted using the Universal Soil Loss Equa- 
tion. N&xt, the increased rate of runoff was simulated based on a 
Soil Conservation Service runoff model. Finally, the increased rate 
of peak flow was predicted using a mathematical model developed at 
the Georgia Institute of Technology. Large scale, 1:6,000 color 
and black and white photographs provided land cover information. 
Topographic data was derived from digital terrain data; however, a 
methodology for deriving slope and length of slope was not described. 
Of primary interest is the coupling of two models to derive an 
assessment of sediment lost and runoff generated. Also unique to 
this study is the application of the soil loss equation to an urban 
watershed. 

A data base approach for prediction of deforestation induced 
mass wasting events was accomplished by Logan (1981). A completely 
automated GIS was used to predict mass wasting in a forested water- 
shed. The erosion model used in this study is a modification of 

the Swanston et al . (1980) approach. Swanston's model is based on 
ten natural and management induced hazard factors. Logan reduced 
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the ten variables to five, These were slope gradient, proximity to 
roads, soils, precipitation, and vegetation cover. The five fac- 
tors were transformed into data base images using the VICAR/IBIS 
image processing system. The images were summed or multiplied based 
on a design developed by Logan to produce a relative mass wasting 
index. Landsat MSS data provided the crop management factor and 
slope information was derived from Digital Terrain Tapes furnished 
by the National Cartographic Information Center. An automated ap- 
proach to simulating a mass wasting model in a geographic informa- 
tion system was developed. However, the Swanston et al . model is 
not appropriate for use in an agricultural location. 

A fully automated geographic information system for soil loss 
prediction in an agricultural region does not exist. Joyce, and 
Patterson and McAdams approached this ideal with the use of Landsat 
data, but fell short with their manual derivation of the topographic 
variables. Berger and Jensen mention the use of digital terrain 
data to calculate slope and length of slope information, but did not 
present a methodology. Logan successfully combined Landsat and 
digital terrain data for a forested watershed. It remains for this 
thesis to develop an automated geographic information system for 
soil loss prediction in an agricultural region. The data and proc- 
essing for an automated soil loss information system are presented 
in Chapters Four and Five. 
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CHAPTER 4 
GEOBASED DATA 


IsopTuvial Map 

The isopluvial map for the Ventura County area was obtained 
from the NOAA Precipitation-Frequency Atlas of the Western United 
States, Southern California subsection (Miller et al. 1973). Two- 
year, six-hour isopluvials were calculated by NOAA from multiple 
regression equations based on precipitation measurements from 
recording and nonrecording gauges. Isopluvials were drawn on maps 
based on the regression prediction for 47,000 grid points in the 
western United States. Topography is an important factor in the 
interpolation of the isopluvials. There are approximately 3300 
recording and nonrecording gauges for the western United States. 
The densest network of gauges are located in the Ventura County 
area; hence, the isopluvials here are very accurate. 

Soil Maps 

Soil maps were obtained from the Ventura County office in 
Somis, California, of the United States Department of Agriculture 
Soil Conservation Serv'ce. The soil maps are at a scale of 
1:24,000, and correspond to the United States Geological Survey 
7.5 minute topogrdphic sheets. Three soil map sheets comprise a 
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7.5 minute quadrangle. In the east-west direction each sheet encom- 
passes 7.5 minutes; in the north-south direction they displace 

2.5 minutes. Map sheets 27, 23 and 24 arranged north to south 
represent the Santa Paula 7.5 minute quadrangle. The soil data 
depicted on these maps were compiled on an orthophoto base created 
from aerial photographs taken in 1959 and 1965. Cultural features 
such as roads, railroads and buildings are resolvable on the maps, 
as are agricultural field patterns. Natural features such as 
creeks and mountains are distinguishable too. 

Soil series and soil phases are delineated on the soil maps. 

The determination of these soil mapping units is based on char- 
acteristics of the soil profile, including the analysis of the 
horizons as to number, order, thickness, texture, structure, color, 
organic content and reaction (acid, neutral or alkaline). Soil 
series have similar hori zonal thicknesses and arrangements. Dif- 
ferences in texture, slope and structure of the soil define a soil 
phase. For example, Callequas is a soil series, and Calleguas 
shaly loam, 9-30 percent eroded, is a particular phase of the Cal- 
leguas series. For this portion of Ventura County, each soil 
series includes from one to six soil phases. 

When two or more soil series are so intermixed that differenti- 
ation between them is difficult or impossible, the resulting asso- 
ciation is termed a soil complex. Castaic-Balcom complex 15-30 
percent eroded is an example of a complex. In locations where 
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little, if any, soil development occurs, an alternative designation 
is used. These soil categories include scindy alluvial plain, land- 
slides, igneous rock land, terrace escarpment, sedimentary rock 
land, fill land, pits and dumps, riverwash and gullied land. These 
are not well established soils and the characteristics used to de- 
fine the other soils do not apply. 

Digital Elevation Model 

The National Cartographic Information Center under the National 
Mapping Division distributes Digital Elevation Model topographic 
data produced by the United States Geological Survey. A DEM is 
defined by the USGS National Cartographic Information Center 
(1980, p. 1) as a: 

...data file of a sampled array of elevations for 
a number of ground positions that are at regularly 
spaced intervals with a defined accuracy. 

DEM topographic data for the Santa Paula quadrangle was obtained in 
the form of a nine track 1600 bit per inch (BPI) computer compatible 
tape (CCT). 

A DEM file may be created from a number of data sources includ- 
ing existing contour plates, profiles of terrain models scanned 
with stereo photogranmetric equipment, or from computer driven 
orthophoto devices. The source for the DEM is high altitude photo- 
graphy acquired at a nominal scale of 1:78,000. The data are 
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processed and resampled to yield one digital elevation model for 
each 7.5 minute quadrangle. The spatial resolution of the DEM is 
30 meters. The vertical resolution is one meter with a root mean 
square (RMS) elevation error of seven meters. The data are a vast 
improvement over Defense Mapping Agency (DMA) digital terrain tapes, 
which are generated from 1:250,000 scaie topographic maps and are 
sampled to a 225 foor horizontal grid. Stow (1978) found that the 
DMA data failed to represent accurately elevation values due to 
the coarse resolution of the 1:250,000 topographic sheets. Some 
values were offset positionally by as many as two pixels. 

DEM topographic data are categorized by the US6S National 
Cartographic Information Center (1980, p. 3) into one of three 
levels, based on editing, enhancements, and spatial structures. 

These levels are: 

(1) DEM 1, a grid of raw elevation data tiiat has been 
edited for gross blunders and has not been keyed 
to planimetry; 

(2) DEM 2, elevation data that has been smoothed for 
consistency, enhanced to remove noise, and fil- 
tered to reduce data volume. DEM 2 has not 
been keyed to planimetry; and, 

(3) DEM 3, elevation data that has been edited and 
modified to be consistent with planimetric fea- 
tures such as streams, road, and shorelines. 

DEM obtained from the NCIC for this study is rated at level one. 

DEM topographic data are aligned along Universal Transverse 
Mercator (UTM) grid lines. The DEM data base is in the initial 
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stage oi development. As of November 1, 1980, fewer than 50 
7.5 minute quadrangles were available from the USGS for the state 
of California. 

Ldndsat MSS Data 

Digital Landsat Multi spectral Scanner (MSS) data from 
June 14, 1978 for the Santa Paula area was obtained from the Earth 
Resources Observation System (EROS) data center, Sioux Falls, 

South Dakota, in the form of a 1600 BPI nine track computer com- 
patible tape (CCT). The NASA Landsat satellite series consists o^ 
three satellites in near polar orbit at an altitude of 920 kilo- 
meters. Landsat images the earth using a mul tispectral scanner 
(MSS) during its 18 day sun synchronous orbit. A standard Landsat 
scene encompasses a 185 by 185 kilometer area, containing approx- 
imately 7,581,600 picture elements (pixels). The instantaneous 
field of view (IFOV) is seventy-nine meters, however, due to over- 
sampling in the across track direction, each pixel represents a 
79 by 57 meter area on the ground. There are approximately 2340 
lines and 32^0 samples per Landsat scene. This figure represents 
an average, due to variation in the sample rate as a result of the 
variable-velocity scanning mirror on Landsat. The MSS output images 
in four wavelength increments of the electromagnetic spectrum. 

These are band 4 .5-. 6 microns (green), #5 .6-. 7 microns (red). 
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#6 .7-.8 microfiS (near infrared), and #7 .8-1.1 microns (near 
infrared) (Sabins, 1978). 
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CHAPTER 5 

DIGITAL PROCESSING 


Generalized Processing Flow 

The variables of the Universal Soil Loss Equation were trans- 
formed into georeferenced data planes generated from the data 
sources discussed in Chapter Four utilizing the VICAR/IBIS image 
processing and geographic information system. Landsat data served 
as the image base for the soil loss information system. The Landsat 
data were registered to the Santa Paula 7.5 minute quadrangle. The 
geometric rectification resampled the Landsat data to sixty meter 
square pixels (0.9 acre), generating a 231 line by 192 sample image. 
Sixty meter square pixels, serving as the basic resolution unit for 
this study, were selected because sixty meters is an even multiple 
of the thirty meter Digital Elevation Model topographic data. 

The rainfall, soil erodability, length of slope, slope grad- 
ient and soil loss tolerance images were all registered to the 
Landsat image base. The rainfall, soil erodability and soil loss 
tolerance images were generated to correspond to the 60 meter 
square 231 line by 192 sample grid. DEM, the source for the slope 
and length of slope images, was resampled to this grid as well. 

The grid was outlined on the Santa Paula topographic sheet. 
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facilitating registration and accuracy assessments for this work. 
The generalized processing flow is depicted in Figure 4. 

Soil Erodability 

Digitizing routines developed at the Geography Remote Sensing 
Unit (GRSU), University of California, Santa Barbara, were utilized 
to create digital data planes from the SCS soil maps on the Talos 
digitizing table. Three control points were required to transform 
digitizer coordinates to geographic coordinates. The registration 
of the maps to the digitizing table created a 231 line by 192 sam- 
ple sixty meter square grid. Since the 7.5 minute quadrangle is 
composed of three soil sheets, each map was registered separately. 

Each soil mapping boundary was digitized using the Talos 
electronic cursor, creating files of the X,Y coordinate pairs. 
Generally, three or four files were necessary to digitize each of 
the three soil maps. The three or four files for each soil sheet 
were concatenated into one file for each soil map. The three soil 
map files corresponding to the Santa Paula 7.5 minute quadrangle 
were then concatenated into one file. This file was transferred to 
XYGEN, converting the strings of X,Y number pairs to a file suit- 
able for processing by VICAR/IBIS. 

The XYGEN file of line segments was input to the IBIS program 
POLYSCRB. POLYSCRB transformed the standard polygon file into an 
image file of polygon borders. Input data in coordinate-point 
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format was used to form a raster base containing polygon outlines. 
Next, the VICAR program LINEAR was applied to the POLYSCRB'd data- 
set, transforming the data from byte to halfword, allowing 32,767 
increments. 

The generation of a halfword dataset was a critical prerequi- 
site for the IBIS program PAINT. PAINT converted the image file 
of polygon borders from POLYSCRB to an image in which each polygon 
was represented by a unique ON. An important parameter invoked was 
the Pborder option. Pborder indicated that the polygon borders on 
the output file were erased; the borders of the polygons were 
included as data within the polygon. This was crucial due to the 
large number of borders on the soil maps. Each border was a pixel 
wide; the loss of these data would have been unsatisfactory. Unfor- 
tunately, the Pborder option of the POLYSCRB routine randomly as- 
signs border pixels to neighboring polygons. This feature reduces 
the accuracy of the digitized datasets near the borders. 

PAINT produced 1498 polygons. These polygons, corresponding to 
soil mapping units, were reduced to soil erodability classes. 

On the soil map, the soil erodability for each soil mapping unit 
was coded, ranging from 1-20. The numbers did not refer to the 
erodability, but were simply values in a lookup table constructed 
for ease in data interpretation. Gullied land, landslides, igneous 
rock land, sedimentry rock land, fill land, riverwash, badland, 
sandy alluvial plain, terrace escarpment and pits and dumps soil 
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mapping units were assigned unique numbers, as were the regular 
soil series and phases. 

A labelling procedure was developed to convert each soil 
polygon to a soil erodability class. The spatial location of each 
PAINT 'ed polygon was matched to the soil mapping unit on the SCS 
soil maps. The lookup value for each soil mapping unit on the SCS 
maps corresponded to the number of the polygon generated by PAINT. 

The VICAR program HSTRETCH was run on this data to reduce the 
original 1498 polygons to 20 classes. 

Reduced K soil polygons ranging from one to twenty were then 
assigned their appropriate K value based on the Universal Coil 
Loss Equation. The regular soil series and phases were stretched 
to .15, .20, .23, .32, .37, and .43. VICAR can process only inte- 
ger values, therefore, each value was multiplied by 100 before 
input to the HSTRETCH program. This increase by two orders of 
magnitude was noted and accounted for later in the processing. In 
all subsequent descriptions, decimal coefficients will be described. 
The values were actually converted to integers through a multi- 
plication by one or two orders of magnitude, and then rectified 
later in the processing. 

K values are not calculated by the SCS for the special soil 
classes because a wider range of erodability within each of these 
special classes necessitates field examination for each site. It 
was not possible to examine all of these sites, therefore, they 
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were deleted from further study; the K values were set to zero. In 
the final erosion prediction soil losses were zero for the special 
soil classes, comprising 8.8 percent of the study area. A large 
majority of these soils are located within the 'ianta Clara River 
channel, an unimportant agricultural site. The river class was 
also excluded from further study, as will be discussed later in 
the chapter. A K value was assigned to each cell on the soil 
erodability image, ranging from 0-.43 (Figure 5). 

Rainfall 

Isopluvials from the NOAA Precipitation-Frequency Atlas 
(Miller et al. 1973) were transferred to the Santa Paula 7.5 minute 
quadrangle. Two-year, six-hour precipitation ranged from 1.6 inches 
in the southern portion of the quadrangle to 2.0 inches in the north. 
The VICAR/IBIS processing flow for the rainfall coefficient was 
similar to the processing flow for the soil erodability coefficient. 
On the Talos digitizing tablet the isopluvials were digitized. The 
digitizer program linked the X,Y coordinate pairs to XYGEN, con- 
verting the X,Y coordinate pairs to a file suitable for further 
IBIS processing. POLYSCRB was then used to transform the XYGEN 
isopluvial vector file into an image file containing polygon 
borders. The raster dataset was input to PAINT, converting the 
raster file of isopluvial borders to an image with a unique ON 
assigned to each polygon. The VICAR program HSTRETCH was run on 
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Registered soil erodability image generated from digitized 
Soil Conservation Service soil maps. Darker tones 
represent less erodable soils; lighter tones represent 
more erodable soils. 
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the PAINT'ed image, transforming the polygon value to the appro- 
priate R coefficient for each rainfall region. An R coefficient 
ranging from .50-. 70 was assigned to each cell on the grid 
(Figure 6). 

Digital Elevation Model 

DEMVIC, a DEM login program, converted the raw DEM data to an 
unlabeled VICAR image in 15 bit format. The VICAR program VSAR was 
applied to the output from DEMVIC to convert the file to VICAR 
format. North was skewed 90 degrees to the east on the DEM. The 
VICAR program PLOT rotated the data counterclockwise, to orient 
north to the top of the grid. 

The DEM was listed off on a lineprinter for visual inspection. 
Canyons extending off the boundary of the Santa Paula quadrangle 
were good features for comparing the DEM to the topographic sheet. 
The location of points on the DEM was in excellent correspondence 
with the topographic map. The four corners of the Santa Paula DEM 
and the Santa Paula 7.5 minute quadrangle were used as ground 
control points (GCP's) in the registration of the DEM to the quad- 
rangle. The VICAR program GEOMA was utilized for registration. 

Two methods to geometrically rectify an image using GEOMA are 
available through VICAR at UCSB. The first is the nearest neighbor 
approach, which does not interpolate grey values. The grey value 
of the nearest pixel from the old grid system is input to the new 
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Registered precipitation image generated from digitized 
NOAA Precipitation-Frequency Atlas. The darkest tone 
represents two-year, six-hour rainfall of 1.6 inches; the 
lightest tone represents two-year, six-hour rainfall of 
2.0 inches. 
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grid system. Intensity values do not change; however, the spatial 
arrangement of the data is altered somewhat. The second method 
applies a bilinear interpolation function to the data. Here, the 
new values have a higher spatial correlation to the old data, but 
the intensity values are altered by the interpolation process 
(Castleman, 1979). The nearest neighbor approach was used to keep 
the data as faithful to the original DEM as possible. 

Geometric rectification resampled the DEM to a 60 meter square 
grid of 231 lines by 192 samples to maintain compatibility with the 
Landsat MSS data. The DEM were tested against the topographic sheet 
to determine the accuracy of the registration. Approximately 40 
points from the grid were analyzed. Peaks of mountains were the 
best topographic features to test because they were easily located 
on the topographic sheet and the DEM. Based on the topographic 
grid, pixels from the DEM corresponded quite well with the topo- 
graphic sheet. No values were more than one pixel off; most were 
in perfect correspondence. The DEM elevation image is presented 
in Figure 7. 

Length of Slope 

An algorithm that computed length of slope from Digital Elevation 
Model (DEM) topographic data was developed. Length of slope is the 
distance from a point in a watershed, to the source of runoff for 
that point, generally a ridge or a hilltop. A search was conducted 
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Figure 7 Registered elevation image for the Santa Paula quadrangle 
from USGS Digital Elevation Model topographic data. Dark 
tones represent the lowest elevations; light tones 
represent the highest elevations. 
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to determine if this calculation using digital topographic data had 
been developed elsewhere. Collins (1975) developed a package of 
programs which calculate watershed characteristics from DEM. Length 
of slope is not included in this package. Junkin (1979), NASA 
National Space Technology Laboratories (NSTL), Mississippi, 
developed a set of routines that compute slope, slope length and 
aspect from Digital Elevation Model topographic data. However, 
length of slope here is the distance across an elevation cell, a 
within cell slope length. The sum of cell lengths along a slope to 
the source of runoff is required for the USLE slope length coef- 
ficient. Therefore, it was necessary to develop a length of slope 
program in-house. 

A buffer one pixel wiae was applied to the border of the DEM, 
enabling slope length co be calculated for cells on the margin of 
the DEM. Starting at line one, sample one, a three by three moving 
window surrounding this coordinate was created. The direction of 
the steepest slope between the middle cell of the window and any 
of the surrounding eight elevation cells was determined. The dis- 
tance bf.'tween the middle cell and highest cell surrounding it was 
calculated by the Pythagorean theorem. The direction of the move- 
ment was noted. 

The window then moved so that the steepest cell from the old 
center became the new center. Direction of steepest slope was again 
determined using the new cell as the center. This direction was 
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limited to forty- five degrees from the previous movement. 

Pythagorean distance was calculated for the new movement. The process 
continued until the slope did not continue uphill. This was the 
source of runoff for the initial cell; the routine located a ridge 
or a peak. The sum of the distance of all the moves was tallied 
and applied to the initial cell. The window then moved to the next 
sample same row and repeated the process. The routine continued 
until the length of slope for all cells of the DEM were calculated. 

An important feature of the algorithm is a preview subroutine. 
If there were two or more equally steep slopes. Preview was called. 
The preview subr-outine set up another window around the equally 
steep slopes. The steepest slope from these equally steep slopes 
was determined. The cell with the steepest new slope became the 
cell for the slope length calculation. Processing then continued 
as before. 

Slope length in meters for each cell was generated by the 
length of slope routine. This was converted to feet to correspond 
to the USLE L factor formula (Figure 8). The formula, 

L * (x/72.6f'\ where X equals the slope length in feet, was input 
to the VICAR program F2. F2 allows general arithmetic functions to 
be applied to one or two input images in byte or halfword format. 

A halfword slope length coefficient for each cell was produced. 
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Figure 8 Registered length of slope image Generated from DEM. 

Dark tones represent short slone lenaths; light tones 
represent longer slope lengths. 
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Slope Gradient 

Calculation of slope gradient from the DEM was implemented 
using a program developed by Professor Jeff Dozier (1979), Depart- 
ment of Geography, University of California, Santa Barbara. Slope 
angle from a DEM is defined as the gradient of a plane tangent to 
the centerpoint on a 3 by 3 grid. The four nearest neighbors of 
each center cell were used to calculate the slope angle of the 
plane. The output for each cell was generated in radians. The 
radians were converted to percent slope to coincide with the USLE 
S coefficient (Figure 9). 

For slopes less than nine percent, the relation 

(0.43 + 0.30S + -0433^) / 6.617, where S equals percent slope, was 

used to calculate the S factor. For slopes exceeding nine percent 
1 3 

the relation (S/9) ' defined the S coefficient. The S coefficient 
was calculated in three steps. A binary mask was formed creating 
two images from the slope image. The first image set slopes 
greater than nine percent to zero and the first equation was ap- 
plied. The second image set slopes less than or equal to nine 
percent to zero and applied the second equation. Finally, the two 
images were added, generating one slope gradient image with an S 
val ue for each cel 1 . 
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Figure 9 
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Registered slope gradient image generated from OEM. Dark 
tones represent lower slope angles; light tones represent 
steeper slopes. 
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Crop Management 

The Landsat login program VERTSLOG read the raw Landsat tape 
and transferred the data to disk. VERTSLOG also corrected for 
skew, aspect ratio, synthetic pixels, mirror scan velocity profile, 
panorama effect and band to band misregistration. The aspect ratio 
was set to 1.41, generating 79.9 by 56.7 meter pixels. 

The Landsat image was then registered to the Santa Paula seven 
and one half minute quadrangle. From the Santa Paula and surround- 
ing quadrangles, 1980 color infrared (CIR) imagery at a scale of 
1:40,000, and the Landsat sub-scene, nine ground control points 
were located. The GCP's were mostly highly reflective buildings, 
or intersections of main highways and railways. The location of the 
GCP's on the image were tied to their respective location on the 
quadrangle. A bilinear interpolation was applied using GEOMA to 
resample the Landsat data to the 60 meter square, 231 line by 192 
sample grid. A band 5 display of the registered Landsat sub-scene 
of the Santa Paula quadrangle is presented in Figure 10. 

The geometrical ly rectified Landsat image was tested against 
the topographic sheet to determine registration accuracy. Approx- 
imately forty randomly selected features on the Landsat image were 
compared to their locations on the topographic sheet. A majority 
of the features were in perfect correspondence on the two grids, 
and none were more than one pixel off. 
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Figure 10 Registered band 5 Landsat sub-scene of the Santa Paula 
quadrangle. 
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The VICAR program USTATS performed an unsupervised clustering 
algorithm on the June image, USTATS incorporates a modified version 
of an algorithm suggested by Tyron and Bailey (1972) for clustering 
large numbers of observations. USTATS initially formed 222 clus- 
ters. This was reduced to 138 by USTATS after removing one pixel 
clusters, and then further reduced to 142 after combining clusters 
that overlapped by one standard deviation. 

The 142 clusters produced by USTATS were reduced to 100 spec- 
tral classes using the Numerical Taxonomy System (NTSYS) program 
package, developed by Rohlf et al. (1974). PDIST, a NTSYS support 
program, calculated a matrix of standardized Euclidean distances 
between clusters in the spectral domain. The standardized distance 
matrix was then input to NTSYS, creating a dendogram of the clus- 
ters. The dendogram was used to condense the 142 initial clusters 
to 100 clusters by merging classes or clusters with the classes 
or clusters to which they were most similar. Some of the small 
clusters were removed from further processing, while other similar 
clusters were merged. The VICAR program EDSTATS merged the clus- 
ters through a recalculation of new class centroids by pooling old 
centroids weighted by their number of pixels. New class variances 
were computed for each channel. The 100 clusters were input to the 
VICAR program FSTCLSPR, a mul tispectral classifier using an algo- 
rithm which combines parallepiped and Bayesian maximum likelihood 
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techniques. All pixels were classified into one of the 100 classes, 
with 1.8 percent of the data unclassified. 

The classes were now labeled. The band 5 L^ndsat spectral 
channel was displayed on a Cathode Ray Tube (CRT) utilizing the 
Grinnel Image Processing System and QDIPS software (Dozier, 1980). 
The classified image containing 100 classes was overlayed on the 
band five display. QDIPS facilitated the display of seven colors 
representing the 100 classes. While the classes were displayed on 
the CRT, the CIR imagery and the 7.5 minute Santa Paula quaorangla 
were used to label the statistical clusters to their land use/land 
cover classes. 

After considerable field inspection and air photo analysis, 
ten land use/land cover classes were derived (Table 1). These 
classes did not fit a strict Anderson et al. (1976) classification 
scheme; but instead were designed to distinguish the different 
effects crop management practices have on erosion potential. 

Unfortunately, the classification was not as accurate as anti- 
cipated; there was considerable confusion between orchards and 
natural vegetation. After examination of the 7.5 minute quadrangle 
and air photos, two conclusions were made; 

(1) No orchard or row crop existed above 800 feet in the Santa 

Paula quadrangle; and 

(2) Orchards on slopes exceeding 10 degrees were extremely rare. 


54 


TABLE 1 

LAND USE/LAND COVER CLASSES 


Class 

Description 

Mature Orchard 

Citrus or avocado orchards with greater 
than 50 percent ground cover 

Inmature Orchard 

Citrus or avocado orchards with less 
than fifty percent ground cover 

Row Crop 

Any crop grown in rows: including 

celery, lettuce, beans, etc. 

River 

Water, sand, rock and riparian 
vegetation confined to the Santa 
Clara River channel 

Urban 

Urban, residential, industrial or 
transportation land use 

Dense Sod 

A very dense cover of grass including 
golf courses and a turf growing outfit 

Grass 

Mostly Mediterranean annual grasses 
with very little soft chaparral 

Chaparral 

Medium to high density soft and hard 
chaparral 

Oak Woodland 

Sites with at least a fifty percent 
oak canopy cover 

Barren 

Sites mostly devoid of vegetation, not 
including river areas 
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Therefore, a binary mask was created using the DEM elevation 
and slope data identifying all elements greater than 800 feet and 
all slopes greater than 10 degrees. Classes less than 800 feet in 
elevation and with a gradient less than 10 degrees retained cluster 
numbers 1-100. Classes with an elevation exceeding 800 feet and/or 
with a gradient exceeding 10 degrees were renumbered 101-200. The 
classified image with 200 classes was now labeled into the ten land 
use/land cover classes. Class numbers over 100 were generally 
grass, chaparral , oak woodland or barren. Class numbers under 100 
were mature orchard, inmature orchard, row crop, dense sod, urban 
or river classes. The binary mask very accurately separated or- 
chards from natural vegetation. 

However, 1.8 percent of the quadrangle remained unclassified. 
The unclassified regions were generally small isolated clusters of 
pixels; large groups of unclassified pixels did not exist. SIMPLIFY, 
a VICAR applications program which removes high frequency components 
(noise) from a classified or stratified image was applied to the ten 
class stratified image. The classified pixels were not affected, 
but the unclassified pixels were assigned to the majority class of 
the surrounding eight pixels. The classified image is presented in 
Figure 1 1 . 

C coefficients were assigned to each of -he land use/land cover 
classes from the charts and tables provided by the USDA Science and 
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Figure 11 Registered ten class land use/1 and cover image nenerated 
from Landsat MSS data, and slope and elevation informa- 
tion from DEM. 
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Education Administration (1978) and supplemental technical notes 
from the USDA Soil Conservation Davis, California (1977) (Table 2). 


TABLE 2 

C COEFFICIENTS 


Class 

C 

Description 

Mature Orchard 

03 

Approximately 60 percent cover 

Immature Orchard 

.44 

Approximately 20 percent cover 

Row Crop 

.40 

Six year row crop sequence with 
winter cover 

River 

.00 

Not calculated by the SCS 

Urban 

.00 

Not calculated by the SCS 

Dense Sod 

.01 

100 percent cover 

Grass 

.07 

Grass, no appreciable canopy 

Chaparral 

.01 

Brush or bushes, 80 percent cover 

Oak Woodland 

.01 

Trees but no appreciable low 
brush, 80 percent cover 

Barren 

.75 

No vegetative canopy 
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Air photo interpretation and field measurements quantified the 
C coefficients. The derivation of the C coefficient for the orchard 
classes was based on a combination of air photo analysis and field 
inspection. The calculations for the row crop and dense sod coef- 
ficients were based on the published charts. For the grass and 
chaparral classes, 100 foot transects for three sites within each 
class provided an average ground cover. Air photo analysis deter- 
mined the percent ground cover for the oak woodland, and the barren 
class C coefficient was obtained from SCS tables. The SCS does not 
provide guidelines for the calculation of C coefficients for the 
river and urban classes. Therefore, they were set to zero; pre- 
dicted soil loss for these sites will be zero tons per acre per 
year. These classes are not agriculturally important and are loca- 
ted in relatively flat locations. 

Conservation Practice 

Since conservation measures are not resolvable on Landsat nor 
available as collateral data, the conservation practice coefficient 
was not applied in this experiment. The conservation practice 
factor was set to one, indicating a lack of erosion control tech- 
niques. Soil conservation measures are not prevalent in the Santa 
Paula area. Howevrr, locations where erosion reducing techniques 
are applied will have lower erosion rates than those predicted from 
this model . 
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Soil Loss Tolerance 

Each soil phase and series has been assigned a soil loss toler- 
ance by the Soil Conservation Service. The soil loss tolerance 
image was generated in the same manner as the soil erodability; the 
only difference arose in the final labelling of the soil loss toler- 
ance values. The tolerances ranged from one to five tons per acre 
per year. Again, the sandy alluvial plain, landslides, igneous 
rock land, terrace escarpment, sedimentary rock land, fill land, 
pits and dumps, riverwash and gullied land classes were not as- 
signed soil loss tolerances by the SCS. For the purposes of this 
study they were assigned tolerance values of zero and applied to 
8.8 percent of the soils. Figure 12 displays soil loss tolerances 
for the Santa Paula quadrangle. 

Predicted Soil Loss 

The five data planes representing the R,K,L,S and C coeffi- 
cients were multiplied together based on the Universal Soil Loss 
equation to derive the A coefficient. The VICAR program F2 can only 
process a maximum of two input images; therefore, the generation of 
a predicted soil loss image was a four-step process. First, the 
R * K images were multiplied. In subsequent steps the (R * K) ^ L, 
the (R * K * L) * S, and finally the (R*K*L^S)*C images were 
multiplied. A problem with the four step approach is that 
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Figure 12 Registered soil loss tolerance image generated from 
digitized Soil Conservation Service soil maps. The 
lightest tone represents a soil loss tolerance of 1 
ton/acre/year; the darkest tone represents a soil loss 
tolerance of 5 tons/acre/year. 
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truncation occurs at each stage. This problem is discussed more 
fully in Chapte<^ 3ix, A prediction of soil loss in tons per acre 
per year was calculated for each 60 meter cell in the 231 I'ne by 
192 sample grid for the Santa Paula quadrangle (Figure 13). 

The final step was the subtraction of the predicted soil loss 
image from the soil loss tolerance image. The resulting film 
writer image displays "'ocalions where predicted soil erosion ex- 
ceeds soil loss tolerances (Figure 14). 
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Figure 13 


Registered predicted soil loss based on uIS inputs to 
the Universal Soil Loss Equation. Lightest tones 
depict predicted soil loss less than 5 tons/acre/year. 
Darkest tones represent predicted soil 'jsses exceeding 
30 tons/acre/year. 
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Figure 14 Registered differenced image of predicted soil loss and 
soil loss tolerance images. Light tones represent 
locations where the predicted soil loss is less than the 
soil loss tolerance. Dark tones represent sites where 
the predicted soil loss exceeds the soil loss tolerance. 
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CHAPTER 6 

ACCURACY AND ANALYSIS 


Land Use/Land Cover Classification 

A stratified random sample was used to determine the accuracy 
of the land use/1 and cover classification. From the VICAR applica- 
tions program SAMPLE, 25 random samples from each of the ten 
classes were obtained for accuracy assessment. Line and sample 
coordinates for each point were listed by the program, 25 for each 
class. Coordinates of the stratified random sample were transfer- 
red to the topographic sheet to aid in the location of the test 
sites. At this time, SAMPLE did not function properly; the coor- 
dinates for some test sites were incorrectly depicted by the SAMPLE 
routine. These samples were rejected. This was a random process 
and did not bias the results of the accuracy assessment. 

There was easy access to the agricultural portions sf the 
quadrangle in the Santa Clara River Valley and the Las Posas 
Valley. Many roads traversed the fields and orchards in the low- 
lands. Here, the method for calculating the accuracy of the clas- 
sified image was field inspection. It was a relatively simple task 
to drive near the sample sites and record the land cover. 
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In the naturally vegetated regions on South Mountain and the 
foof'ills north of the Santa Clara River Valley, another method for 
determining the accuracy of the sample points was necessary. There 
were fewer roads in this rugged terrain, and some of the ranchers 
did not permit access to their property. Therefore, air photo 
interpretation provided the accuracy assessment for the naturally 
vegetated regions. Using 1:32,000 scale CIR imagery from February 
1979, discrimination between the four land cover classes was rela- 
tively easy. These classes were grass, chaparral, oak woodland 
and barren. The large scale of the U-2 imagery, and the response 
of the vegetation on the infrared film, allowed an accurate assess- 
ment of land cover for the naturally vegetated regions. 

From the stratified random sample, 84.3 percent of the pixels 
were correctly classified (Table 3). By weighting each class by 
its representative area, 87.2 percent of the quadrangle was cor- 
rectly classified. These are the highest classification accuracies 
obtained for the Oxnard Plain region. Stow et al. (1980), with only 
seven classes of interest, achieved 70 percent accuracy in the 
neighboring Oxnard and Camarillo quadrangles. There was consider- 
able confusion between orchards and chaparral without the elevation 
and slope information obtained from the DEM. Orchards and chap- 
arral both have large percentages of bare soil interspersed with 
the vegetation, and Landsat spectral data alone does not differ- 
entiate the two classes adequately. 
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TABLE 3 

CLASSIFICATION ACCURACY 
SANTA PAULA QUADRANGLE 


LANDSAT CLASSIFICATION 





c> 

/ 



/ 

o 



mature 

orchard 

2 ^ 

88% 

1 


1 

1 






immature 

orchard 


84% 


1 

1 






row 

crop 


3 

81.3% 








river 




^7% 

1 






urban 

1 

1 

2 

4 

1^ 

66.6% 






dense 

sod 



2 



846% 





grass 







22/ 

917% 

2 



chaparral 

1 






2 

2^ 
91 7% 



oak 

woodland 

1 







3 

2 ^' 

833% 


barren 

1 

2 







1 


1^/ 
81 3% 


Pixels Correctly Classified 84.3% 
Quadrangle Correctly Classified 87.2% 
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The accuracies of the land use/land cover classification are 
relatively consistent for all classes except the urban class. The 
low accuracy here, 66.6 percent, is due to considerable confusion 
with the river class. The topographic properties are similar, and 
the spectral similarities between roads, buildings, and river 
gravel make discrimination difficult. However, forthe study of agri- 
cultural er^^sion, river and urban categories are not classes of 
interest, and were not included in the final soil loss calculation. 

Row crops were on the lower end of classification accuracy, 

81.3 percent. There was some confusion between row crops and im- 
mature orchards due to the large amount of bare ground found on 
some row crops at different times of the year, and the large per- 
cent of bare soil on the immature orchards. The effects of the 
confusion are lessened somewhat by the fact that the C coefficients 
for row crop and irmature orchard are .44 and .40, respectively. 

This similarity reduces the impact of mi sclassifi cation error on 
the final soil loss prediction. 

Most of the remaining confusion between classes also occurred 
in classes with relatively similar C coefficients. This is logical, 
as the C factor is largely based on vegetative cover protecting the 
soil. Stephens and Cihlar (1981) followed this line of reasoning 
by correlating the infrared/red ratio with the C coefficient in a 
study described in Chapter Three. There was some confusion 
between barren and immature orchards. Both of these classes have 
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high C coefficients, .74 and .40 respectively. There are very few 
instances of classes with widely disparate C coefficients display- 
ing confusion. These considerations enhance the overall acceptabil- 
ity of the accuracy value of 87.2 percent. 

Collateral Data 

Accuracy assessments for the collateral datasets were obtained 
in the following manner. From the original stratified random sam- 
ple, ten random samoles from each class were used, excluding the 
river and urban classes. These two classes were eliminated from 
further investigation. A subsample of 80 sites remained, ten sites 
for each of the eight classes of interest. The 80 values from the 
collateral images were tested against the corresponding values for 
the sources of those images. For example, 80 rainfall values on the 
rainfall image were tested against the corresponding 80 values on 
the rainfall map. A problem is that the rainfall, soil erodability 
and soil loss tolerance are discrete datasets. The slope and length 
of slope images are continuous datasets. Two different approaches 
for calculating accuracies were necessary. For the discrete data- 
sets, a standard accuracy percentage scheme was employed. For the 
continuous datasets, a Pearson product moment correlation coeffi- 
cient was calculated. 

The eighty sites on the rainfall image were tested against the 
isopluvial polygons drawn on the Santa Paula quadrangle. As 
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there were only four isopluvial polygons, the accuracy of this 
dataset was 100 percent. This does not infer that geobased LISLE 
models depiction of the rainfall is 100 percent accurate. It does 
mean that the geobased model describes the rainfall factor from the 
LISLE completely. Obviously, orographic uplift and subsequent 
aridity on the lee side of a mountain are not adequately accounted 
for on the 1:2,000,000 scale NOAA isopluvial maps. However, model- 
ling this effect is beyond the scope of this research. 

The soil erodability and soil loss tolerance accuracies were 
tested in the same manner. The eighty site sample on the soil 
erodability and soil loss tolerance images were tested against the 
Soil Conservation Service soil maps. The same 231 by 192 60 meter 
square grid system was outlined on the SCS soil maps. Line and 
sample coordinates on the soil property images corresponded directly 
to the line and sample coordinates on the soil maps. The accuracy 
for both the soil loss tolerance and soil erodability images tested 
against the soil maps was 96.25 percent. Each of the soil property 
datasets had three observations in error. Nearly 1500 soil poly- 
gons were manually edited to the appropriate soil loss tolerance 
and soil erodability values based on the SCS designations. Very few 
of the polygons were mislabeled in the labelling procedure; the 
accuracy of the soil erodability and soil loss tolerance images is 
quite good. 
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Originally, it was the goal of this study to measure the 
topographic variables, slope and length of slope, in the field. 
However, it soon became apparent that this was not possible due to 
limited access to the naturally vegetated regions. Therefore, the 
topographic test data was generated from the topographic sheet. The 
same random sample was applied to test the L and S coefficients for 
accuracy. The slope was measured from the topographic sheet in the 
following manner. A distance of 200 feet was measured along the 
steepest direction through the point in question. This point 
became the midpoint of the slope. The elevation change was noted 
by sunning the coutour lines (20 foot contour interval) for the 
200 foot distance, corresponding approximately to the 60 meter DEM 
cell. The resulting rise over run was the slope for the site. 

The slope from the topographic sheet was correlated to the 
slope generated from Professor Dozier's program. The ^earson product 
moment correlation coefficient, R, was .93, signific? t to the 
.0001 level. This figure displays very good corre'' cion between the 
two values. The topographic sheet and the DEM provide independent 
data sources for the correlation calculation. It is no^ valid to 
consider the slope information from the topographic sheet ground 
truth. The calculation of slope from twenty foot contours on a 
1:24,000 scale topographic map is possibly less accurate than slope 
generated from the DEM. However, the topographic sheet does provide 
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another source of slope information, and thus can provide a basis 
for comparison of topographic factors. 

For the digital slope length method, the spatial resolution of 
the DEM is an important consideration in the calculation of slope 
length. The resampled 60 meter square grid on the DEM does not 
adequately account for microrel ief. Small rises are not discerned 
on the digital topographic data. Very likely, the DEM exaggerates 
slope length, missing some of the microrelief which would terminate 
a slope length. The DEM is sensitive to larger scale topographic 
factors. These are considerations that must be evaluated in terrain 
analysis implementing DEM topographic data. The DEM is, however, 
the highest resolution digital topographic data available for 
general use by the public. 

To test the length of slope, a line was drawn along the 
steepest slope through each test site. This line followed the 
steepest slope until the slope became negative. At this point the 
Source of runoff for the initial point was located. On the steeper 
portions of the quadrangle the slope length ended at a ridge at a 
peak. On the floodplain, with its more subdued topography, it was 
more difficult to determine the end of the slope; here, slight 
variations in elevation concluded the slope length. The Pythagorean 
distance between the initial point and the end of the slope was cal- 
culated as the slope length. 
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Length of the slope derived from the DEM was correlated to 
the manually derived length of slope. The Pearson product moment 
correlation coefficient, R, was .81, significant to the .0001 level. 
Again, this is simply the correlation of the DEM lengtli of slope 
and the map-derived lergth of slope. The map-derived length is 
certainly not the best method to calculate slope length because the 
scale on the topographic sheet may be too small for an accurate 
representation. However, from the relatively high correlation coef- 
ficient, it is apparent that the two sets of measurements are 
reasonably coincident. 

Finally, the predicted soil loss derived from the geobased soil 
loss model was tested against the manually derived coefficients of 
the LISLE. The same set of random samples was applied. The coef- 
ficients for the manual USLE were obtained from the rainfall map, 
the SCS soil maps, slope and length of slope from the topographic 
sheet, and the groundtruth and air photo analysis of the study area. 
These coefficients were multiplied together based on the USLE, 
yielding an A value. The geobased coefficients were also multiplied 
together using the VICAR program F2, yielding an A value. Because 
the USLE is a multiplicative function, it is appropriate to trans- 
form the geobased and manually based value to their natural loga- 
rithms before calculating a correlation coefficient (Li, 1964). 

The Pearson product moment correlation coefficient, R, of the log 
transformed data was .91, significant to the .0001 level. This 
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figure displays very good correlation between the geobased and 
manually derived USLE soil loss predictions. Classification ac- 
curacy from the collateral datasets and predicted soil loss are 
surmarized in Table 4. 


TABLE 4 

COLLATERAL AND PREDICTED SOIL LOSS DATASET ACCURACIES 


Dataset 

Accuracy 

Rainfall 

100.00 

Percent 

Soil Erodability 

96.25 

Percent 

Length of Slope 

.81 

Correlation 

Slope Gradient 

.93 

Correl ation 

Soil Loss Tolerance 

96.25 

Percent 

Predicted Soil Loss 

.91 

Correlation 
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Sensitivity Analysis 

An informal sensitivity analysis performed on the effects of 
the five coefficients on the final predicted soil loss is presented 
in Table 5. 


TABLE 5 

SENSITIVITY ANALYSIS 


Coefficient 

Range 

Percent Change 

R 

.50 - .70 

140 

K 

.15 - .43 

286 

L 

1.35 - 2.92 

221 

S 

1.00 - 34.10 

3412 

C 

.01 - .75 

7500 
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The greatest variability was obtained from the slope and crop 
management coefficients. The misclassification of chaparral 
(C = .01) as barren (C = .75) created an error in the A factor of 
7500 percent. However, this error occurred only once in the sample 
analyzed. Slope is a very sensitive coefficient of the LISLE. The 
slope factor depicted by the DEM is fairly accurate, and error 
within this calculation is unlikely to extend the full range of the 
S value. Length of slope and the soil erodability factors are 
moderately sensitive to variation within their respective coef- 
ficients. The soil image was quite accurate and does not account 
for much error in the A calculation. The length of slope is the 
least accurate of the datasets. Fortunately, it has the second 
lowest sensitivity of the five coefficients, reducing the impact of 
miscalculation of the L coefficient on the predicted soil loss. The 
rainfall factor is quite stable, and probably does not add error to 
the A calculation. 

One final test '-•as undertaken as a check on the VICAR proces- 
sing sequence. The F2 program, which performs arithmetic functions 
on linages, can process a maximum of two images at a time. Multipli- 
cation of the five datasets of the USLE was accomplished in four 
steps as descrioed in the previous chapter. Unfortunately, only 
integer values are allowed as output from F2. From the series of 
four multiplications, some error may have been introduced by trun- 
cation. A test was conducted to determine if this was the case. 
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The five coefficients of the geobesed model for the same test 
sample (80 values) were manually multiplied based on the LISLE. The 
results were tested against the geobased automated A value. The 
Pearson product moment correlation Coefficient R value was .98, 
significant to the .0001 level. Obviously, little error was intro- 
duced by the truncation of values during the F2 processing sequence. 

Analysis of Predicted Soil Loss 

The images of predicted soil loss and soil loss versus soil 
loss tolerance show that the orchards and row crops along the flood- 
plain do not display soil loss prob^iems. Soil losses here a'^e less 
than five tons per acre per year and, for the most part, are less 
than the soil loss tolerances. There are several locations in the 
flood plain where predicted soil losses exceed soil loss tolerances. 
In the northeastern portion of the Las Posas Valley there is a re- 
gion of steep slopes p"* anted in row crops. These sites display an 
increased predicted soil loss. Along the western portion of the 
Santa Clara River, row crops planted on soils with soil loss toler- 
ances of one ton per acre per year exceed their soil loss tolerance. 
Immature orchards, mostly avocado, extending into the canyons of the 
mountainous areas display high predicted soil loss rates. The 
problem is not as widespread on the Santa Paula quadrangle as was 
previously theorized. 
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Slope and crop management appear to be the strongest indica- 
tors of soil loss problems. On the fairly level sites soil loss 
is relatively minor; on the steeper and longer slope sites soil 
loss is considerable. The grass sites often have minimal vegeta- 
tion cover, arid soil loss is potentially high. Predictions consis- 
tently exceed soil loss tolerances in the grass areas on South 
Mountain and in the foothills north of the Santa Clara River Valley. 
Chaparral (C = .01) is superior to grass (C = .07) in protecting 
the ground from rainfall erosion losses. The prevalent fires on 
South Mountain remove the chaparral cover; a grass canopy returns. 
This succession causes an increase in predicted erosion for the 
mountainous areas. 

Several trends are apparent from analysis of the predicted 
soil loss image: 

(1) Soil loss is not a problem on the floodplains and valleys, 
except with row crops on soils with low soil loss tolerances 
or on the steeper slopes; 

(2) Immature avocado orchards display high predicted soil losses 
in the canyon sites, however, the problem is not extensive; 
and, 

(3) The steeper mountainous locations are serious soil loss areas, 
especially those areas exhibiting a grass canopy. 

Thus, the geobased soil loss model accurately represents the 
manual Universal Soil Loss Equation. Although imperfect, this 
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soil loss information system has potential to inventory large areas 
for predicted soil loss with a savings in both time and money over 
conventional ground sampling (Hanuschak et al . 1979). This model is 
probably best applied as a prescreening mechanism to identify 
major areas of soil loss, in which a user is more concerned with 
the relative amounts and the spatial extent of the predicted soil 
losses. The geobased model certainly provides th's information. 
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CHAPTER 7 
CONCLUSION 


This thesis research has demonstrated the capability of a geo- 
graphic information system including Landsat and collateral data 
to map potential soil loss using the Universal Soil Loss Equation. 
This geobased soil loss information system accurately depicted the 
Universal Soil Loss Equation for a 7.5 minute quadrangle, an agri- 
cultural region exceeding 100 square miles. The prediction of soil 
erosion for a small site is useful, but the ability to project 
this prediction to a larger area provides a much greater perspective 
of soil loss problems to resource managers. 

As an example, Mugu lagoon, at the mouth of Revlon Slough, 
is experiencing serious sedimentation problems. Sediment is trans- 
ported from the upper watershed; much of it is from the Santa Paula 
quadrangle. The predicted soil loss image clearly shows that loca- 
tions of steeper slopes under grass canopy are the major contribut- 
ing sources for soil lost from this area. Resource managers can 
implement this information by introducing a more protective plant 
cover in this area, such as chaparral or sod-forming grasses. The 
introduction of immature orchards in canyons extending into the 
mountainous areas is increasing soil loss problems. Conservation 
practice techniques such as contour terracing can reduce the 
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erosion hazard. Another alternative is a zoning ordinance pro- 
hibiting or placing restrictions on the planting of orchards in 
critical areas identified by the geobased soil loss model. The 
decision is the resource manager's; the data are provided by the 
VICAR/IBIS soil loss model. 

The intermediate images developed and employed in the research 
display ancillary information in an effective manner. The soil 
erodability and soil loss tolerance images reduce the soil maps 
from approximately 1500 soil mapping units to iDore comprehensive 
five- to eight-class images. These simplifications are helpful 
because the distribution of the soil erodability and soil loss toler- 
ances for soil classes is not obvious on complex soil maps. The 
length of slope image graphically depicts the distance from each 
pixel to the source of runoff for that point; topographic sheets 
do not provide as clear a representation as the image. The land 
use/land cover classification image displays information in a form 
more interpretable than Ventura County land use maps. And finally, 
the VICAR/IBIS data base is a set of overlain digital rasters, al- 
lowing versatility in processing and display. 

Additional information can be generated from the intermediate 
data sets. Since the borders of the soil polygons exists as an 
IBIS file, soil properties, such as permeability and PH, which have 

been calculated by the SCS for the soil mapoing units, can easily 
be displayed as images. Permeability influences infiltration 
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rates and the magnitude of ground water recharge; soil PH can 
provide an assessment of crop suitability for a site. From the 
slope gradient image, discrete ranges in slope can be generated 
(i.e., five to ten, ten to fifteen, fifteen to twenty percent 
slope, etc.), and the location and the percentage of each slope 
increment can be determined. Thus, there is considerable potential 
for an automated approach to data handling and display; the 
limitation is the imagination and the creativity of the user of the 
geographic information system. 

Statistics from the intermediate data sets are useful as well. 
Histograms for the soil erodability and soil loss tolerance images 
have been generated from this work. This information can help the 
Soil Conservation Service as well as county planners, as such data 
are currently unavailable from other sources. A histogram of the 
land use/1 and cover classification was also produced. Again, per- 
centages of land use/land cover classes for this quadrangle are 
unavailable from the Ventura County Planning office. 

The mapping and display of watershed characteris'^ics from DEM 
also shows great promise. Typical parameters derived from digital 
terrain data include slooe gradient and aspect. In this work we 
have developed length of slope, which is used in the LISLE, but 
there are other applications for this variable as well. An indica- 
tion of tne amount of moisture available for vegetation at a 
specific site can be obtained from the length of slope. Shorter 
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slope lengths have less runoff and they are, therefore, less 
likely to become completely saturaccd. Sites with longer slope 
lengths have a greater source for overland flow providing an ad- 
ditional moisture availability. For runoff routing schemes, the 
prediction of volume of runoff through a point can be facilitated 
by length of slope information. Other topographic information, 
such as the delineation of watershed divides, can also be obtained 
from the DEM. Thus, a package of watershed variables derived from 
the DEM has considerable potential for watershed analysis. 

Extension of this research might involve a prediction of soil 
erosion rates for an entire watershed. From tnis predicted soil 
loss, a sediment routing scheme similar to Kling's (1973) could 
be developed. A oredicted soil loss mao could be generated, as 
well as a map depicting potential sediment accumulatic..s for the 
watershed. Designed within the context of a GIS, these simulations 
would enhance a resource manager's ability to assess erosion prob- 
lems for an agricultural region. 

For this study, the intermediate datasets were accurate and 
flexible in their representation of environmental information. The 
predicted soil loss and the soil loss versus soil loss tolerance 
image: correctly identified predicted soil loss problem areas for 
the Santa Paula 7.5 minute quadrangle. The strength of a geographic 

information system lies in the accuracy of the input data, and the 
ability cf the system to process and display the data in a suitable 


83 


fashion for resource analysis. In this, it is the authors' opinion 
that the VICAR/IBIS soil loss information system succeeded on all 
counts. 

The following is a summary of conclusions drawn from this 
study. Listed below are conclusions from the accuracy tests. 

(1) The correlation, R, of the predictions of the VICAR/IBIS soil 
loss model and the manual LISLE values was .91 (log transform), 
thus explaining most of the variance of the relationship. 

The system was highly effective in identifying areas of 
critical soil loss with this degree of error. 

(2) The Landsat classification accuracy using a collateral binary 
mask derived from slope and elevation information from the 
DEM was 84.3 percent, with an estimated 87.2 percent of the 
quadrangle correctly classified. This accuracy is very good; 
most of the misclassification was from classes with 
similar C coefficients. Accuracies were considerably lower 
without topographic information. The registration of the 
Landsat to the Santa Paula quadrangle with nine GCP's was 
excellent, employing a bilinear interpolation resampling 
approach. 

(3) The DEM is an accurate nigh resolution data source. Precise 
registration of the DEM to the Landsat from the four 
corners of the DEM was achieved. 
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(4) Slope gradient from the DEM as tested against the topographic 
map displayed a correlation coefficient, R, cf .93, a very 
good correlation. 

(5) Length of slope was the least accurate dataset. Correlation 
to the test data provided an R coefficient of .81. More work 
is necessary on the length of slope routine. It is unclear 
if the scale of the DEM permits a sufficiently precise 
depiction of the microrelief necessary for a length of 
slope calculation. 

(6) Both the soil and rainfall images were input to the system 
with little error. 

(7) Processing using the VICAR F2 sequence was acceptably accurate 
in spite of possible roundoff errors. The five-variable 
multiplication as tested against a manual multiplication cor- 
related with a coefficient, R, of .98. 

(8) Sensitivity analysis showed that slope and crop cover were 
potentially the most sensitive factors of the five LISLE 
coefficients in predicting soil loss. 

(9) The most serious soil loss problems are in areas with steeper 
slopes under grass cover. Inmature orchards in the foothills 
of the mountains display high predicted soil losses; however, 
these sites are not extef’sive. 
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General conclusions from this study are as follows. 

(1) Incorporating topographic data in Landsat classification is an 
extremely efficient method to improve classification accuracies. 
However, a thorough knowledge of the study area and of 
environmental processes affecting land use/land cover are pre- 
requisites for success in this endeavor. 

(2) Watershed characteristics can be effectively derived from DEM 
data. This study implemented slope and length of slope 
information, as well as elevation data. Length of slope infor- 
mation can be used in applications other than as the USLE 
variable, and watershed divide information can also be obtained 
from DEM topographic data. 

(3) Digitized soil maps are a powerful information source. Once 
the borders are coded, attributes other than erodability and 
soil loss tolerance, such as permeability and PH, can be 
incorporated into a GIS. 

(4) Useful statistics from soil maps, topographic data and land 
cover data were obtained. The percentage of slopes within a 
certain range, the proportion of specific soil erodability and 
soil loss tolerances for soils, and the percentage of land 
covered in certain land use/land cover categories can be easily 


calculated. 
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(5) The 60 meter square 231 line by 192 sample grid for all of the 
datasets, as well as the topographic sheet and soil maps, was 
an effective coordinate system. Ground survey and accuracy 
assessments were greatly facilitated by the use of this grid. 
The grid also aided the registration of the Landsat and the 
DEM to the Santa Paula quadrangle. 

(6) The use of topographic maps for accuracy verification in 
this study is questionable. The scale of the topographic 
sh“»ts tfiay not adequately depict slope or length of slope. 
However, field data for these factors were not obtainable. 

(7) All of the data used in this study are available in the United 
States, except the DEM. Defense Mapping Agency topographic 
data may be substituted for the DEM, but the DMA data are not 
as accurate. In time, the DEM will be more widely available. 

(8) VICAR/IBIS proved to be a powerful, flexible, and well ir.e- 
grated image processing and geographic information system. 
Problems with VICAR/IBIS were with the two-image F2 capability, 
and nhe POLYSCRB routine, which randomly assigns border pixels. 
However, VICAR/IBIS effectively processed all of the jatasets 
according to user needs. 

(9) A geographic information system based Universal Soil Equa- 
tion can be applied to a wide range of agricultural regions, 

similar in climate to the continental United States. Appli- 
cation to forested watersheds is not recommended because the 
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USLE was developed to predict erosion for agricultural land. 
Application of this model to other climatic zones, such as 
tropical environments, is subject to further research on 
soil erosion processes for these specific areas. 
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